Introduction


This Technical Report is intended to be used as a basis for
training evaluator-practitioner teams to collaborate in the
design and implementation of useful evaluations. It is organized
around the idea that a useful evaluation depends on prevention
evaluators working closely with practitioners to design feasible
evaluations. Practitioners should then support evaluators in
their efforts to assess accurately the nature and effects of
prevention programs. This document presents both general
materials on how evaluation fits into the practice of prevention
and technical details of implementing an evaluation plan. The
combination will give prevention staff common ground on which to
build.

•  Chapter  1  of  this  guide  asserts  that  useful 
evaluations  are  tailored  to  intervention  type 
and  stage  of  development.  This  information 
will  be  useful  to  preventionists  as  well  as 
evaluators. 

•  Chapter 2 describes types of evaluation and
evaluation outcomes. It also discusses the
problem of credibly attributing observed
effects to the prevention intervention.
Practitioners and evaluators alike should be
familiar with this basic information.

•  Chapter  3  outlines  several  experimental 
and  quasi-experimental  evaluation  designs. 
Though  of  limited  value  to  experienced 
evaluators,  the  information  will  be  useful  to 
prevention  administrators  who  may  lack 
understanding  of  various  research  designs. 


A  Guide  for  Evaluating  Prevention  Effectiveness 


•  Chapter 4 discusses critical, practical issues
that should be addressed before beginning
data collection. Evaluators inexperienced in
applied research should review these
recommendations thoroughly. Practitioners
will find this chapter useful because it
details the rights and responsibilities of
program personnel and clients in the
evaluation process.

•  Chapter 5 summarizes basic quantitative
and qualitative methods of evaluation. Both
practitioners and evaluators should be
familiar with these methods.

•  Chapter 6 concerns concepts in data analysis
with which prevention administrators
should be familiar. It notes how some types
of data analyses are better suited to specific
types of research questions and different
stages of program development.

•  Chapter 7 draws conclusions and makes
recommendations.

The appendix offers a list of resource books
on evaluation. The glossary that follows the
appendix will be particularly useful to
practitioners unfamiliar with evaluation terms
and processes.


Chapter  1 


Intervention  Types  and  Their  Stages  of 
Development 


In order to obtain meaningful information about any prevention
intervention, evaluators must select methods that are appropriate
for the type of intervention they are evaluating. They must also
ask evaluation questions that correspond to the intervention's
stage of development. Mismatch between these aspects of the
intervention and its evaluation will produce friction and may
make both the intervention and the evaluation less useful.

Classifying  an  intervention  according  to 
type  and  stage  of  development  is  the  first  step 
in  tailoring  evaluation  methods  to  the  needs  of 
the  program.  This  chapter  addresses  this  critical 
first  step.  It  begins  with  a  brief  discussion  of 
two  basic  prevention  approaches — individual- 
oriented  interventions  and  policy  interventions. 
It  concludes  by  elaborating  on  the  concept  of 
intervention  stage  of  development,  providing 
sample  questions  that  are  appropriate  for  each 
stage,  and  listing  examples  of  research  methods 
that  are  often  used  in  particular  stages. 

Approaches  to  Prevention 

Traditionally, the prevention field has emphasized
individual-oriented approaches to prevention. After all, smoking
tobacco, drinking alcohol, and taking drugs are individual
behaviors. It makes sense to intervene with individuals. In fact,
interventions with individuals are the most prevalent, and they
are relatively easy to implement. Frequently these types of
interventions provide services such as employee assistance
programs and afterschool alternatives programs. Other
interventions attempting to change individual-level behavior
include school-based education programs and "drive safe" media
campaigns.

There is growing research evidence, however, that policies can
change the environment so as to change individual behavior and to
reduce substance-related problems. Policy interventions may be
defined as those affecting the social, economic, and regulatory
environment around substance use. They attempt to change the
environment in such a way that substances become less available
and/or more expensive. Policy interventions also change norms,
values, and expectations so that they are less supportive of
substance use that results in health and social problems. Policy
interventions include formal changes in laws at the Federal,
State, or local level, as well as changes in institutions (e.g.,
schools, law enforcement agencies, retail establishments,
families).

Interventions that try to reduce the use of alcohol, tobacco, and
illicit drugs by focusing entirely on individual knowledge,
attitudes, beliefs, and behavior produce less long-term success
in decreasing substance-related problems than do policy
approaches (Bangert-Drowns, 1988; Clayton, Cattarello, Day, &
Walden, 1991; Moskowitz, 1989; Ringwalt, Ennett, & Holt, 1991;
Rosenbaum, Flewelling, Bailey, Ringwalt, & Wilkinson, 1994;
Rundall & Bruvold, 1988; Tobler, 1992). Individually focused
prevention approaches show the most promise when implemented in
the context of a larger community change effort (e.g., Holder et
al., 1997; Pentz et al., 1989; Perry et al., 1996; Wagenaar,
Murray, Wolfson, Forster, & Finnegan, 1994).

Ultimately, policy (or environmental) approaches anticipate that
individual decisions to use substances will change. Thus, the
distinction between individual and policy approaches is not
always clear-cut. For example, sometimes law enforcement
strategies to deter substance-related behavior (e.g., sobriety
checkpoints) are considered an individual approach because they
are intended to persuade individuals not to engage in problem
behaviors. On the other hand, these strategies can be seen as one
way of establishing an environment in which these problem
behaviors are clearly unacceptable.


Table 1 presents examples of interventions aimed at reducing
problem behavior through changes at the individual level. Table 2
provides examples of policy strategies for reducing substance use
and related problems in the general population.

Individual and policy interventions require somewhat different
forms of evaluation, as will be discussed in the chapters that
follow. Regardless of intervention type, however, evaluation
involves the collection of data in ways that allow evaluators and
preventionists to assess and improve the ways in which the
intervention is conducted. These methods are fairly well
understood, but the specific outcomes measured, study samples
selected, and statistical analyses employed are likely to be more
similar within an intervention type than across types.
Evaluations of interventions aimed at changing individuals will
measure the attitudes, beliefs, and/or behaviors of the
individuals exposed to the intervention. In contrast, evaluations
of policy interventions will measure community-level and
systems-level changes.

Some of the strategies listed in tables 1 and 2 have extensive
evidence of effectiveness. Research has not yet established the
level of effectiveness for others. It is clear from inspecting
table 2, however, that strategies for dealing with illicit drugs
are far less available and less well developed than are
strategies relevant to alcohol and tobacco. One reason for this
disparity is that because alcohol and tobacco are legal
substances, there are many more opportunities for reducing
consumption and problems through regulatory approaches.
Nonetheless, policy approaches to illicit drugs other than
conventional law enforcement efforts are emerging, and the
knowledge base regarding the effectiveness of these approaches is
growing (Bureau of Justice Assistance, 1993; Davis & Lurigio,
1996; Green, 1996; Roehl, Wong, Andrews, Huitt, & Capowich,
1995).

Stages  of  Development 

In addition to recognizing the type of intervention approach,
good evaluation methodologies consider stage of development. As
prevention efforts mature, the questions posed by the evaluation
and the methods used to address them will change. Evaluations
during the earlier

Table 1. Individual-Oriented Interventions To Reduce Substance Abuse


School-Based 

•  Resistance-skills  training  program  designed  to 
increase  youths'  ability  to  withstand  the  pressure 
or  temptation  to  use  alcohol,  tobacco,  or  drugs 

•  Skills-building  program  increasing  social  and 
academic  abilities 

•  Alcohol,  tobacco,  and  drug  educational  program 
teaching  students  about  the  dangers  and  risks 
associated  with  use  and  fostering  a  more 
accurate  perception  of  norms 

Church-Based 

•  Group  discussion  or  support  sessions 
centered  around  personal  difficulty 
management  (e.g.,  stress,  grief, 
divorce) 

•  Series  of  sermons  on  the  dangers  of 
smoking 

•  Testimony  during  church  services  from 
members  giving  up  the  use  of 
substances 

Work-Based 

•  Employee  assistance  (EA)  program  instructing 
employees  how  to  handle  stress  at  work  and 
at  home 

•  EA  program  teaching  employees  about  the  risks 
and  dangers  associated  with  abuse 

•  EA  program  enabling  parents  to  prepare  their 
children  for  self-care  through  discussions  about 
self-care  issues  (safety,  decisionmaking, 
communication,  substance  abuse) 

Community-Based 

•  Afterschool  program  for  latchkey 
children 

•  Parent-to-parent  network  helping  all 
parents  in  a  community  to  supervise 
children 

•  Mentoring  program  exposing  youths 
to  positive  adult  role  models  and 
encouraging  high  academic  and 
professional  aspirations 

Family-Based 

•  Skills-building  and  educational  program  for 
parents  and  youth  to  improve  communication 
and  increase  knowledge  about  substance- 
related  issues 

•  Parenting  skills-building  program  for  single 
and/or  young  parents  who  may  need  assistance 
dealing  with  needy  or  troubled  children 

•  Family  therapy  to  improve  communication  and 
attachment  in  families  of  delinquent  youths 

Health  Care-Based 

•  Training  of  health  care  providers  to 
detect  the  signs  and  symptoms  of 
substance  abuse 

•  Discussion  of  the  dangers  of  substance 
abuse  in  every  health  service 
encounter 

College-Based 

•  Discussion  groups  examining  advertisement 
and  promotional  strategies  of  the  alcohol  and 
tobacco  industries 

•  Support  groups  for  adult  children  of  alcoholics 
to  examine  issues  related  to  parental  alcoholism 
and  stressful  situations  both  on  campus  and  at 
home 

•  Discussion  of  the  dangers  of  substance  abuse 
in  every  nonacademic  counseling  encounter 

Other 

•  Driver  education  classes  that  include 
attention  to  the  risks  associated  with 
alcohol-  and  drug-impaired  driving  and 
penalties  for  driving  under  the  influence 

•  Court-ordered  educational  or  treatment 
programs  for  impaired  drivers 

•  Media-sponsored  "drive  safe"  campaigns 

•  Elderly-outreach  program  to  assist  with 
stressful  life  circumstances  (e.g., 
retirement,  grief,  living  alone) 
Table 2. Policy Interventions To Reduce Substance Abuse Problems


Alcohol

Public Policies

•  Excise taxes

•  Limits on hours or days of sale

•  Restrictions on density, location, and type of outlets

•  Mandatory server training and licensing

•  Dram shop and social host liability

•  Restrictions on advertising and promotion

•  Mandatory warning signs and labels

•  Restrictions on consumption in public places

•  Restrictions on happy-hour sales

•  Prevention of preemption of local control of alcohol
regulation

•  Minimum drinking age

•  Keg registration ordinances

•  Enhancement of drivers' licenses (to indicate age clearly and
prevent fraud)

•  Ban on home deliveries

•  Compulsory compliance checks for minimum purchase age and
administrative penalties for violations

•  Establishment of minimum age for sellers

•  0.00 blood alcohol content (BAC) for young drivers

•  0.08 BAC for adult drivers

•  Administrative license revocation for impaired drivers

Organizational Policies

•  Warning posters (businesses)

•  Restrictions on alcohol advertisements (media)

•  Restrictions on alcohol use at work and work events
(businesses)

•  Restrictions on sponsorship of special events (communities,
stadiums)

•  Police walkthroughs at alcohol outlets

•  Undercover outlet compliance checks (law enforcement agencies)

•  Responsible beverage service policies (outlets)

•  Mandatory checks of age identification (businesses)

•  Server training (businesses)

•  Incentives for checking age identification (businesses)

•  Restrictions on sales to those accompanied by individuals
under age 21 (businesses)

•  Prohibition of alcohol on school grounds or at school events
(schools)

•  Enforcement of school policies (schools)

•  Prohibition of beer kegs on campus (colleges)

•  Establishment of alcohol-free dormitories and campuses
(colleges)

•  Establishment of enforcement priorities against adults who
illegally provide alcohol to youth

•  Sobriety checkpoints (law enforcement agencies)

•  Media campaigns about enforcement efforts (media)

•  Safe ride programs (businesses)

•  Identification of source of alcohol consumed prior to
driving-while-intoxicated arrests (law enforcement agencies)
Tobacco

Public Policies

•  Excise taxes

•  Tobacco sales licensing system

•  Prohibition of smoking in public places

•  Prevention of preemption of local control of tobacco sales

•  Restrictions on advertising and promotion

•  Ban on vending machines

•  Compulsory compliance checks for minimum purchase age and
administrative penalties for violations

•  Minimum age of sale of 18

•  Warning labels

•  Mandatory seller training

•  Ban on self-service sales (all tobacco behind the counter)

•  Minimum age for sellers

Organizational Policies

•  Establishment of smoke-free settings (restaurants, workplaces,
hospitals, stadiums, malls, day care facilities)

•  Counteradvertising (media)

•  Restrictions on sponsorship of special events (communities,
colleges, stadiums)

•  Prohibition of tobacco use on school grounds, in buses, and at
school events (schools)

•  Enforcement of school policies (schools)

•  Mandatory checks of age identification (businesses)

•  Seller training (businesses)

•  Incentives for checking age identification (businesses)

•  Undercover shopper or monitoring program (businesses)

Other Drugs

Public Policies

•  Control of production and distribution

•  Zoning and building codes that discourage drug activity and
penalties for property owners who fail to address known drug
activity

•  Mandated school policies

Organizational Policies

•  Employer policies (businesses)

•  Surveillance of high-risk public areas (law enforcement
agencies, neighborhood watch groups)

•  Enforcement of zoning and building codes (law enforcement
agencies, building authorities)

•  Appropriate design and maintenance of parks, streets, and
other public places (e.g., lighting, traffic flow) (city
agencies, housing authorities)

•  Enforcement of school drug policies (schools)

Note:  Institutions  that  can  develop  suggested  institutional  policies  are  indicated  in  parentheses. 
phases of an intervention tend to focus research questions around
the nature of its implementation. Such evaluations are called
process evaluations. Evaluations during later phases center
research questions around finding evidence of the intervention's
effect on desired substance abuse-related outcomes. Such
evaluations are called outcome evaluations.

Regardless of the intervention's stage of development, some
evaluation questions should anticipate issues beyond the
immediate concern. For example, even though a new prevention
effort may be in its infancy and most evaluation questions center
around the nature of implementation and/or service delivery, some
research questions should focus on anticipating and detecting
initial or preliminary outcomes. Similarly, a prevention effort
in later stages of development (with research questions focused
around outcomes) should continue to collect data about the nature
of implementation and/or service delivery in order to detect
process issues that may contribute to unanticipated outcomes.

Of course, the stage of development notion does not completely
capture the dynamic nature of intervention development. In
reality, prevention efforts are highly dependent on personnel,
funding, and other organizational elements, and development is
not always sequential. Prevention personnel may find themselves
in several stages simultaneously. A given step may be repeated
until success is achieved, or it may be skipped altogether. An
awareness of this general developmental course, however, can be
useful in tailoring the evaluation appropriately to the stage of
development.

Differentiating prevention efforts according to their stage of
development should (1) make expectations for the short-term
accomplishments of the intervention more reasonable; (2) provide
information that can help prevention efforts move further along
in their development; (3) reduce evaluation costs by limiting
evaluation activities to those needed to answer the questions of
greatest relevance; and (4) increase the usefulness of evaluation
activities by focusing on pertinent questions.

Table 3 lists the stages of intervention development, provides
examples of the types of evaluation questions likely to be asked
at each stage, and suggests different evaluation methods that
might be employed to answer those questions. This classification
combines ideas from several sources (Florin, Mitchell, &
Stevenson, 1993; French & Bell, 1984; Goodman & Wandersman, 1994;
Gottfredson, 1984; Kibel, 1994; National Institute on Drug Abuse,
1988, 1991; Sechrest & Figueredo, 1993).

The wide array of qualitative and quantitative evaluation methods
shifts from description and exploratory analyses in the earlier
stages of the evaluation to more comparative and confirmatory
analyses in later stages. Thus, a mix of evaluation methods is
required and, ultimately, most useful. Consequently,
practitioners and evaluators should:

•  Select  evaluation  questions  appropriate  for 
the  intervention's  stage  of  development; 

•  Select  an  appropriate  mix  of  data  collection 
techniques;  and 

•  Assess  the  nature  of  implementation,  as 
well  as  its  effects  on  substance  abuse- 
related  outcomes. 
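
As a rough illustration only (it is not part of the original guide), the pairing of stage, question, and method described above could be recorded in a small lookup structure. The names Stage, StagePlan, PLANS, plan_for, and emphasis are hypothetical. The questions and methods are quoted from table 3, one per stage, and the two emphasis labels are an assumed simplification of the shift from process to outcome questions described in this chapter.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """Stages of intervention development, in the order given in table 3."""
    INITIATION = 1
    PLANNING = 2
    PILOT_TESTING = 3
    IMPLEMENTATION = 4
    STABILIZATION = 5
    DISSEMINATION = 6


@dataclass(frozen=True)
class StagePlan:
    sample_question: str  # one sample evaluation question from table 3
    sample_method: str    # one data collection method from table 3


# One question and one method per stage, copied from table 3.
# A real plan would list several of each.
PLANS = {
    Stage.INITIATION: StagePlan(
        "Is there sufficient political support for this effort?",
        "Community survey to assess concerns",
    ),
    Stage.PLANNING: StagePlan(
        "What are the perceived causes of the problem?",
        "Analysis of similar previous or existing prevention efforts in the community",
    ),
    Stage.PILOT_TESTING: StagePlan(
        "Is the intervention implemented as intended?",
        "Examination of implementation logs",
    ),
    Stage.IMPLEMENTATION: StagePlan(
        "What unanticipated effects occur?",
        "Client satisfaction questionnaire",
    ),
    Stage.STABILIZATION: StagePlan(
        "Are positive effects maintained over time?",
        "Analysis of intervention expenditures",
    ),
    Stage.DISSEMINATION: StagePlan(
        "Is expansion worth the cost?",
        "Cost-benefit analysis of monetary expenditures to documented positive effects",
    ),
}


def plan_for(stage: Stage) -> StagePlan:
    """Return the sample question and method recorded for a stage."""
    return PLANS[stage]


def emphasis(stage: Stage) -> str:
    """Assumed two-way label: process questions dominate through pilot
    testing, and outcome questions join them from implementation onward."""
    if stage.value <= Stage.PILOT_TESTING.value:
        return "mainly process"
    return "process and outcome"
```

A team might call plan_for(Stage.PILOT_TESTING) when drafting the evaluation plan for a pilot. Because development is not always sequential, a real plan could draw on the entries for several stages at once.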

Summary 

This chapter suggests that appropriate evaluation methodologies
stem from a recognition of an initiative's characteristics.
Evaluations must be suited to the type of initiative as well as
to its stage of development. Frequently, evaluations will be most
useful if they evolve in tandem with the prevention effort as it
moves through various stages of development. Process evaluation
questions are most common during earlier stages of an initiative.
Outcome evaluation questions are more appropriately asked of a
mature program. However, the groundwork for a rigorous outcome
evaluation must be laid earlier in the life of a project. It is
therefore necessary to anticipate and plan for outcome evaluation
well before the program is ready for this level of scrutiny. The
immediate information needs of a developing project, however,
will center around issues other than outcomes during earlier
stages of development. Similarly, process information about
program implementation should be collected throughout the life of
the program.
Table 3. Evaluation Questions and Data Collection Methods Suited to Different
Stages of Development


Initiation Stage

Tasks

-  Mobilize support and resources
-  Develop intervention capacity
-  Begin documenting intervention's development
-  Begin thinking about evaluating the intervention

Sample Evaluation Questions

-  What is the level of interest or concern about the problem in
the community?
-  Who is involved in developing the intervention?
-  What resources (human and other) are available and needed?
-  Is there sufficient political support for this effort?

Examples of Appropriate Data Collection Methods

-  Community survey to assess concerns
-  Content analysis of newspaper and other media accounts
-  Survey of participants' perceptions of "organizational climate"
of the group and disposition for implementing the initiative
-  Observation of the community mobilization process
-  Organizational analysis to assess organizations and persons
participating in awareness and mobilization efforts, noting
breadth of community sectors represented and number of key
community leaders involved

Planning Stage

Tasks

-  Determine nature and extent of problem
-  Investigate alternative approaches
-  Develop theoretical framework
-  Define intervention goals
-  Develop initial implementation plan
-  Develop preliminary evaluation plan
-  Begin evaluation

Sample Evaluation Questions

-  What are the characteristics and needs of the target
population?
-  What are the perceived causes of the problem?
-  What is the cultural context in which the intervention will
operate?
-  Who is involved in developing the intervention?
-  What are the existing policies that address the problem?
-  Are there specific constituencies that might be inclined to
support policy changes?
-  What prevention interventions are available and appear to be
acceptable to the community?
-  What obstacles need to be overcome for successful
implementation?
-  What are the expectations for the intervention?

Examples of Appropriate Data Collection Methods

-  Survey and review of archival data on the nature and extent of
problems
-  Survey of attitudes and beliefs about the targeted problem and
the proposed approaches
-  Analysis of similar previous or existing prevention efforts in
the community
-  Counts of organizations and individuals involved in the
planning process
-  Content analysis of initial plan, including attention to:
(1) the presence, clarity, and reasonableness of theory of
action; (2) the match between presumed causes of the problem to
be addressed and the approaches to reducing problems; (3) the
level and clarity of expectations for the activities;
(4) anticipated obstacles and plans to overcome them; and
(5) numbers and types of planned activities
Pilot-Testing Stage

Tasks

-  Monitor intervention to detect problems
-  Solve implementation problems
-  Refine implementation plan
-  Refine evaluation plan
-  Continue education

Sample Evaluation Questions

-  What are participants'/clients' current behavior, attitude, or
understanding about the problem?
-  What resources (anticipated and unanticipated) are used?
-  What obstacles are encountered?
-  Are organizational processes and systems (e.g., communication,
decisionmaking, lines of authority) adequate?
-  Is the intervention implemented as intended?
-  What are clients'/participants' reactions to the intervention?
-  What costs are associated with the intervention?

Examples of Appropriate Data Collection Methods

-  Survey to assess behavior, attitudes, knowledge, or actions (or
similar preintervention assessment)
-  Analysis of records on recruitment efforts, attendance
patterns, and expenditures
-  Observations of intervention context and/or service delivery
-  Focus groups with policy implementers, service deliverers,
and/or service recipients to assess their opinions of the
intervention
-  Examination of implementation logs
-  Followup interviews with program dropouts to ascertain causes
of attrition
-  Client satisfaction questionnaire

Implementation Stage*

Tasks

-  Monitor intervention to detect problems and immediate needs
-  Monitor support for and satisfaction with intervention
-  Monitor intervention effects
-  Monitor capacity
-  Refine intervention (if necessary)

Sample Evaluation Questions

-  Is the intervention implemented as intended?
-  What resources (anticipated and unanticipated) are used?
-  What obstacles are encountered?
-  Are organizational processes and systems adequate?
-  What costs are associated with the intervention?
-  Are staff adequately trained to deliver services or implement
policy changes?
-  What changes in behavior, attitudes, knowledge, or actions are
evident?
-  What unanticipated effects occur?
-  What are clients'/participants' reactions to the intervention?
-  What is the overall impact on the community and/or target
population?

Examples of Appropriate Data Collection Methods

-  Observation of intervention context and/or service delivery
-  Analysis of program records on recruitment efforts, attendance
patterns, and expenditures
-  Examination of intervention implementation logs
-  Focus groups with service providers or policy implementers to
assess their opinions of the intervention
-  Followup interviews with program dropouts to ascertain causes
of attrition
-  Focus groups to assess changes in attitudes and knowledge as
well as opinions
-  Survey to assess changes in behavior, attitudes, knowledge, or
actions (or similar postintervention assessment)
-  Client satisfaction questionnaire
-  Analysis of archival records concerning incidence of focal
problem
-  Community survey of reactions to the intervention
Stabilization Stage*

Tasks

-  Demonstrate value to sponsors, clients, and community
-  Enhance capacity
-  Defend budgets and resources in the face of competing claims
for resources
-  Assess and sustain intervention effects

Sample Evaluation Questions

-  Do organizational processes and systems remain adequate?
-  Is service delivery or policy implementation maintained and/or
improved?
-  Are new costs associated with the intervention?
-  Do staff remain qualified to deliver services or implement
policy appropriately?
-  Are new changes in behavior, attitudes, knowledge, or actions
observed?
-  Are there any other unanticipated effects?
-  Are positive effects maintained over time?
-  Are benefits worth the cost?

Examples of Appropriate Data Collection Methods

-  Focus groups to assess changes in attitudes and knowledge as
well as opinions of service delivery
-  Focus groups with service providers or policy implementers to
assess their opinions of the intervention
-  Community survey to assess reactions to the intervention
-  Analysis of intervention expenditures
-  Assessment of generalizability of findings
-  Analysis of questionnaires, interviews, observations, archival
records

Dissemination Stage

Tasks

-  Assess projected needs
-  If considering expansion,
   •  expand resources
   •  engage in Initiation Stage tasks

Sample Evaluation Questions

-  Are there alternative approaches or additional policies that
should be explored and possibly implemented?
-  Should interventions be targeted to new areas?
-  Is expansion worth the cost?

Examples of Appropriate Data Collection Methods

-  Examination of documentation of decisions to expand the
prevention effort
-  Assessments of needs to services provided
-  Cost-benefit analysis of monetary expenditures to documented
positive effects

* The tasks and questions associated with the Implementation and Stabilization Stages are not exclusive to the stage and,
once posed, tend to be ongoing throughout the course of the program.
Chapter 2


Frameworks  for  Evaluation 


Evaluations  should  provide  information  that 
enables  managers  and  policymakers  to 
make  timely  decisions  concerning  appropriate 
prevention  efforts.  This  chapter  discusses  the 
conceptual building blocks necessary for conducting
useful evaluations. It begins with a
brief  discussion  of  the  distinctions  between 
process  and  outcome  evaluations  and  between 
intermediate  and  long-term  outcomes,  then 
explains  the  concept  of  attribution  of  effect. 

Process  Versus  Outcome  Evaluations 

Although there are multiple approaches to evaluation,
comprehensive approaches involve the
systematic gathering of information about the
prevention intervention's operation, as well as
its effects. Thus, evaluators conduct both
process and outcome evaluations to get a complete
understanding of an intervention. Figure 1
illustrates how the proportion of process- to
outcome-oriented data collection activities
tends to shift in an ongoing evaluation. Note
that process evaluation tends to dominate in
early phases but continues throughout the
course of the evaluation. The emphasis on outcome
evaluation tends to increase as the intervention
matures.

Process  Evaluation 

Process  evaluation  is  ongoing  assessment  and 
documentation  of  the  planning,  development, 
and  implementation  phases  of  an  intervention. 
Largely  descriptive,  process  evaluation  can 
focus  on  numbers  and  characteristics  of  clients, 
services,  and  program  staff.  Process  evaluation 
can also allow implementers to compare intervention
progress with intervention objectives
and expectations. If the goals and expectations
are not being met, the intervention can be
adjusted.

[Figure labels: Outcome-Oriented Evaluation; horizontal axis, Stages of Program Development, from Initiation Stage to Dissemination Stage]
Figure 1. Focus of research during the course of a comprehensive evaluation.

Perhaps the most useful aspect of process
evaluation is that it enables evaluators to interpret
or understand better the intervention's
outcomes. For instance, an ordinance may be
passed mandating that all alcohol servers
receive training on how to avoid serving alcohol
to intoxicated or underage customers. If
later outcome evaluation determines that intoxicated
and underage patrons were still able to
buy alcohol, process evaluation might indicate
why the intervention did not have the intended
effect. Perhaps not enough servers actually
received the training. Perhaps turnover among
staff at retail outlets is so great that the training
should be repeated more frequently. Perhaps
management of retail establishments did not
support service staff in their efforts to avoid
inappropriate sales. The data collected through
process evaluation can provide important information
for explaining the causes of high or low
performance levels (i.e., outcomes). Methods
for collecting data to evaluate the process of
a prevention intervention are discussed in
chapter 5.


Outcome  Evaluation 

Often when prevention administrators state
that an intervention has been successful, they
cite data derived from process evaluation as
evidence of this success. That is, they cite observational
data on practices, participants' reactions
to or evaluations of services, and other
types of feedback. While such information is
useful, it is important to move beyond this type
of assessment. Prevention success ultimately
must be defined by outcomes.

In essence, outcome evaluations attempt to
find out whether a prevention effort made a difference
in the lives of clients or in the community.
Specifically, outcome evaluation documents
whether a prevention intervention produced
the desired effects: for example, changes in
behavior or in the number of problem events.
These evaluations also look for unexpected
outcomes or side effects. Sometimes, outcome
evaluations even ask how the benefits of the
intervention compared with the costs.

Most policy interventions and many individual-oriented
interventions have intermediate
and long-term goals. In the long term, these
interventions are intended to reduce substance-related
health and social problems either in
individuals or in a community. These long-term
outcomes are achieved, however, through the
accomplishment of intermediate outcomes.
Figure 2 illustrates the stages by which prevention
interventions alter behaviors and problems
at both the individual and community levels.

[Figure labels: Interventions With Individuals; Acquisitions of Knowledge and/or Skills; Reductions in Substance Abuse; Policy Interventions; Community Changes; Reductions in Substance Abuse; Reductions in Substance Problems; column headings, Intermediate Outcomes and Long-Term Outcomes]
Figure 2. Evaluation targets.

Examples of intermediate outcome evaluation
questions include:

•  Did the intervention change attitudes
among the targeted population? For example,
if the intervention was designed to
increase community support for a specific
policy or program, did public support actually
improve?

•  Did the intervention increase skills among
the targeted population? For example, if the
intervention was designed to increase drug
resistance skills, did they actually increase?

•  Was the environment actually changed by
the intervention? For example, if the intervention
was designed to reduce youth
access to a substance, did access actually
decrease?

•  Did the intervention change specific targeted
behaviors? For example, if parents were
to discuss substance abuse issues with their
children more frequently, did they actually
do so?

•  Did rates of use of the targeted substance
decline?


Long-term  outcome  evaluation  questions 
may  include: 

•  Did  use  among  the  targeted  population 
decline? 

•  Did  rates  of  targeted  problems  decrease? 

-  Was  the  frequency  of  the  problem  lower 
after  the  intervention? 

-  Did problems decrease more in communities
experiencing the intervention than
in comparison communities?

-  Did  problems  decrease  more  in  certain 
populations  than  in  others? 

-  Were  reductions  in  problems  sustained,  or 
did  beneficial  effects  dissipate  over  time? 

The answers to these and similar questions
can help improve existing interventions, provide
the motivation to sustain successful efforts,
and prompt other communities to implement
similar strategies.

The selection of appropriate intermediate
variables is important and requires a thorough
understanding of the causation of substance-related
problems and the ways in which strategies
work to reduce problems. For instance, a
family-oriented prevention program may
attempt to prevent later substance abuse among
youth by improving parenting skills of parents
of small children. The intermediate outcome is
changes in family management; the long-term
outcome is reduced substance abuse by children.

In another example, reduction of the legal
blood alcohol level for driving is one policy
approach to reducing alcohol-related traffic
crashes. An apparently logical intermediate outcome
would be an increase in the number of
arrests made at the lower blood alcohol level. In
fact, few such arrests are likely to be found.
Even after the law is changed, police tend to
arrest drivers with very high blood alcohol levels
because they are easier to detect. Nonetheless,
the lowered level does have the effect of
reducing alcohol-related crashes, for example,
by making people feel that there is a greater
threat of arrest when driving with any amount
of alcohol. Consequently, people are less
inclined to drink and drive. Thus, it is important
when designing any evaluation to think
through carefully the chain of events that the
intervention is likely to cause. Understanding
this "causal model" will help ensure that the
appropriate outcomes will be considered at
each step of the evaluation.

If the intermediate outcome is not in fact
causally related to the long-term outcome, the
results of the evaluation will not actually predict
prevention success. For example, many
prevention programs have used increased self-esteem
as an outcome measure, even though
there is no correlation between self-esteem and
substance abuse (Schroeder, Laflin, & Weis,
1993). Evaluators and program planners should
carefully examine their assumptions about the
relationship between the intermediate and long-term
outcomes and, whenever possible, include
actual measures of substance use and problems
as part of the evaluation. This usually means a
reduction in a specific problem or decreased
risk of a problem. For example, if the target outcome
is self-reported drug use by young people,
then such drug use should go down. In
some cases, the substance-related outcome itself
can be measured. For example, changes in the
rate of alcohol-related traffic crashes can occur
fairly quickly following policy interventions,
making it possible to determine the effectiveness
of policy interventions with this focus.

Evaluating both intermediate and long-term
outcomes may at first appear to be excessive.
After all, we want to reduce substance-related
problems, so why not just measure rates of
these problems? Obviously the ultimate goal is
to assess the effect of an intervention on health
and social problems, but measuring the effects
of strategies on problems may be difficult. In
some cases, it may take many years before benefits
emerge. For example, the life-saving benefits
of preventing the sale of tobacco to minors
will not emerge for decades. In some cases, the
shifts in State and community norms and values
prompted by policy strategies will take time
to develop. By measuring intermediate outcomes,
the relative effectiveness of interventions
can be forecasted and determined more quickly.
Also, if an intervention is not successful, evaluation
of intermediate outcomes helps to identify
where it can be strengthened.

Attribution  of  Effect 

The purpose of any evaluation is to determine
whether an intervention was effective according
to stated goals and as measured by valid outcome
criteria. Effectiveness can be measured by
(1) comparing postintervention conditions (outcomes)
with what would have happened had
the intervention not taken place, or (2) comparing
one set of postintervention conditions with
those of another intervention.

The problem in evaluating an intervention
in its own right is that it is not possible to
observe what would have happened to a group
of people without the intervention or if an alternative
intervention had been implemented.
Analysts must identify and develop methods of
predicting what would have occurred if a specific
intervention had not taken place. The population
that received the intervention may be
compared with a similar population or group
that did not receive the intervention, which acts
as the control. For example, a community partnership
targets four public housing sites, situated
in one ward of a large city, for intensive
delivery of its youth- and family-focused prevention
programs. Public housing sites that are
similar in terms of living conditions and resident
populations to those targeted for services
could serve as comparison sites. By comparing
conditions between the two sites before and
after the intervention, the amount of change
that interventions produce in substance abuse
indicators or other desired outcomes can be
measured. Thus, systematically varying interventions
across individuals, sites, or time can
provide information about the strength of relationships
between interventions and outcomes.
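
The before-and-after comparison of targeted and comparison sites can be sketched as a simple calculation: the change at the targeted sites minus the change at the comparison sites. The Python sketch below is illustrative only. The function name, the indicator (incidents per 1,000 residents), and all of the numbers are hypothetical assumptions, not data from any actual evaluation.

```python
from statistics import mean

def difference_in_differences(target_pre, target_post,
                              comparison_pre, comparison_post):
    """Change at targeted sites minus change at comparison sites."""
    target_change = mean(target_post) - mean(target_pre)
    comparison_change = mean(comparison_post) - mean(comparison_pre)
    return target_change - comparison_change

# Hypothetical indicator values, one number per housing site.
target_pre = [42.0, 38.0, 45.0, 40.0]
target_post = [30.0, 29.0, 36.0, 31.0]
comparison_pre = [41.0, 39.0, 44.0, 40.0]
comparison_post = [38.0, 37.0, 42.0, 39.0]

estimate = difference_in_differences(
    target_pre, target_post, comparison_pre, comparison_post
)
print(f"Estimated change attributable to the intervention: {estimate:.2f}")
```

In this made-up example the targeted sites improved by more than the comparison sites did, and only that extra improvement is credited to the intervention. The estimate is credible only to the extent that the comparison sites would have followed the same course as the targeted sites without the intervention.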

Ideally, the intervention is designed as an
"experiment." True experiments require that
the units of analysis (for example, community
residents or chosen communities) be randomly
assigned to the intervention or control condition.
Random assignment occurs when each member of
a pool of potential participants has an equal
chance of being assigned to the intervention
group or the control group. The advantage of
a true experimental design is that random
assignment ensures that the intervention and
control groups do not differ from each other in
any systematic way. If statistical methods find
differences between the two groups after the
intervention, those differences are most likely
due to the intervention.
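
As a minimal sketch of random assignment, the following Python example shuffles a pool of potential participants and splits it in half, so that every member has the same chance of landing in either group. The participant labels, the function name, and the seed are hypothetical; a real evaluation would document its own assignment procedure.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the pool and split it into intervention and control groups."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the original pool is unchanged
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]

# Hypothetical pool of 20 potential participants.
pool = [f"resident_{i:02d}" for i in range(1, 21)]
intervention_group, control_group = randomly_assign(pool, seed=1)

print("Intervention group:", sorted(intervention_group))
print("Control group:", sorted(control_group))
```

The same procedure applies when the units are classrooms, schools, or communities rather than individuals; only the contents of the pool change.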

When a prevention effort is directed at individuals,
it is often possible to assign participants
randomly in a study. Some evaluations
randomly assign classrooms, schools, neighborhoods,
or even communities to experimental or
control groups. Often, however, random assignment
may be too expensive, impractical, or
even inappropriate.

Because true experiments often are not possible
or are too expensive, prevention evaluations
must sometimes rely on what is known as
"quasi-experimental design" to isolate the relationship
between interventions and outcomes.
In the absence of random assignment, the ability
to analyze linkages between interventions
and outcomes depends on the ability to isolate
their interrelationships. For instance, adolescents
whose parents are extremely supportive
of high academic achievement might be more
likely than others to comply with school-based
programs of all sorts, including substance abuse
prevention, and might also be less at risk for
substance use. In this and other instances of
apparent intervention-to-outcome linkages
there may be no real relationship between participation
in program interventions and outcomes.
Individuals who participate more in
school programs might be found to have more
favorable outcomes, such as lower indices of
substance use risk. That may not mean that the
intervention caused the outcome. The relationship
could be due to unmeasured factors that
influence both the participation in interventions
and the attainment of outcomes.

Considering the role of factors other than
the intervention is also important in evaluating
interventions conducted at the community or
State level. For example, the number of alcohol-related
car crashes may decline following a
community prevention intervention. The
decline in crashes may, however, be due to an
increase in gasoline prices, a new State law
increasing penalties for driving under the influence,
or a recession that reduces personal
incomes. Indeed, the change in auto crashes
may be part of a downward trend that preceded
the prevention program. The challenge of outcome
evaluation is to sort out the effects of
other factors so that any changes in behavior
can be accurately attributed to the prevention
effort itself.

Researchers use the term "internal validity"
to indicate that the intervention being tested has
an effect independent of other factors that
might result in a similar outcome (Campbell &
Stanley, 1963). Table 4 highlights some of the
problems encountered when researchers try to
isolate the effects of an intervention from extraneous
factors that may affect the desired outcome.
These problems include history,
maturation, measurement effects, statistical
regression, and selection. It is generally not possible
to eliminate all of these problems, but
attempts are made to minimize and measure
their impact.

Summary 

This chapter briefly outlined the differences
between process and outcome evaluations and
between intermediate and long-term outcomes.
It then discussed the problem of accurately
attributing effects to prevention interventions.
A credible evaluation plan will acknowledge
the concepts presented in this chapter by indicating
how common evaluation problems will
be addressed.

Table 4. Threats to Internal Validity

Threat to Internal Validity: History
Definition: Events or conditions other than the intervention may have an impact on the experimental group and affect outcome measures independent of the intervention.
Example: If a designated driver program is implemented, other factors, such as a change in the price of gasoline or alcohol, or enforcement of stricter legal penalties for driving while intoxicated (DWI), could reduce DWI and related accidents, independent of the designated driver program.

Threat to Internal Validity: Maturation Effects
Definition: Changes in the experimental group unrelated to the intervention may influence outcome measures.
Example: Children tend to use more substances as they get older, independent of external circumstances.

Threat to Internal Validity: Measurement Effects
Definition: The process of collecting information may influence responses on the measures being used to assess outcomes. While completing a survey or test, the respondent's knowledge or attitudes may be altered, and subsequent responses may change solely as a result of having been surveyed, observed, or interviewed. Problems may also occur with archival data because of changes in reporting rather than actual changes in incidence.
Example: Questions about attitudes toward alcohol or drugs on a preintervention survey may make respondents reflect on their alcohol and drug attitudes and change their responses to more socially acceptable ones on a postintervention survey.

Threat to Internal Validity: Regression to the Mean
Definition: When a value is extreme or out of the ordinary, subsequent values tend to be closer to or regress to the mean.
Example: If there is a rash of substance abuse-related incidents in a year, there are likely to be fewer substance abuse-related incidents in subsequent years, with or without the intervention.

Threat to Internal Validity: Selection Effects
Definition: An apparent experimental effect occurs because of an inequality between the treatment and control groups. Differences between the experimental and control groups after an intervention may result from unobserved initial differences between the groups, rather than as a result of the intervention.
Example: A school-based substance abuse prevention program was found to be very effective in that participants reported less drug use than nonparticipants. Students who already had negative attitudes toward drug use, however, were more likely to participate in the program.
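
Regression to the mean is often easier to see in a small simulation than in a definition. The Python sketch below is a hypothetical illustration: every simulated community has the same underlying incident rate and no intervention takes place, yet the communities with the worst first-year counts look better in the second year. The rate, the amount of year-to-year noise, and the function name are all assumptions chosen for the example.

```python
import random
from statistics import mean

def simulate_regression_to_mean(n_communities=1000, true_rate=50.0,
                                noise_sd=10.0, top_fraction=0.1, seed=0):
    """Return year-1 and year-2 mean rates for the worst year-1 communities."""
    rng = random.Random(seed)
    # Each year's observed rate is the same true rate plus random noise.
    year1 = [rng.gauss(true_rate, noise_sd) for _ in range(n_communities)]
    year2 = [rng.gauss(true_rate, noise_sd) for _ in range(n_communities)]
    # Select the communities with the highest observed rates in year 1.
    n_selected = max(1, int(n_communities * top_fraction))
    worst = sorted(range(n_communities), key=lambda i: year1[i],
                   reverse=True)[:n_selected]
    return mean(year1[i] for i in worst), mean(year2[i] for i in worst)

year1_mean, year2_mean = simulate_regression_to_mean()
print(f"Worst communities, year 1: {year1_mean:.1f}")
print(f"Same communities, year 2:  {year2_mean:.1f}")
```

An evaluation that targeted only the communities with the worst first-year figures, and had no comparison group, could mistake this purely statistical drop for a program effect.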



Chapter  3 


Outcome  Evaluation  Designs 


In this chapter, two types of research designs
for outcome evaluations are discussed: experimental
and quasi-experimental. Neither type of
research design is better or more scientific than the
other design type. Whether a research design can
produce credible results depends on how well
suited the research design is to the intervention
and how the research design is actually implemented
within the context of the evaluation.
The books listed in the appendix contain lengthier
discussions of issues relating to evaluation
design.

Experimental  Designs 

Experimental designs always include random
assignment to experimental or control conditions
to assist with attribution of effect. Different
experimental designs are useful in different
situations. Several alternatives are described
below.

Pretest-Posttest  Control  Group  Design 

In a pretest-posttest control group design,
participants are randomly assigned to either a
treatment group or a control group. As can be
seen in figure 3, the two groups are twice
assessed at the same time. Only one group,
however, is exposed to an intervention in the
period between the two assessments.

Group 1:  Measurement ► Program ► Measurement
          (Randomized Assignment to Group 1 or Group 2)
Group 2:  Measurement ----------► Measurement

Figure 3. Experimental design: simple pretest-posttest control group.

By way of example, Quinn County Prevention
Coalition operates two innovative programs
designed to encourage a healthy lifestyle
for youths and to discourage drug use, and it
wants to find out which program has a greater
effect on youths' behavior. Using a pretest-posttest
control group design, sophomores at
Washington High School are randomly
assigned to three health classes, one for each
program and one as a control group.

Group 1:  Measurement ► Program 1 ► Measurement
Group 2:  Measurement ► Program 2 ► Measurement
          (Randomized Assignment to Group 1, 2, or 3)
Group 3:  Measurement ------------► Measurement

Figure 4. Experimental design: multiple group pretest-posttest control group.

Before beginning the programs, all students
are administered a pretest that measures
lifestyle practices and drug use. In one class,
the instructor teaches the innovative health
prevention curriculum (intervention Group 1);
in a second class, 10 "superstars" make 1-hour
presentations (intervention Group 2); and in
a third class, the teacher conducts the traditional
health class (Control Group). At the end of
the 5-week period, all students are administered
a posttest that measures the attitudes
and behaviors targeted for change. Test scores
are then compared across classes to determine
effectiveness.


Figure 4 presents the Washington High
School example. This design enables the evaluators
to conclude that whatever differences are
measured among the groups, they are most
likely due to differences among the programs
the students received. Ideally, of course, a
longer term followup would be desirable to
determine if changes persist over time.
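
The comparison of test scores across classes can be sketched as a comparison of average pretest-to-posttest gains. The Python example below is illustrative only: the scores are invented, the group labels follow the Washington High School example, and the function name is hypothetical. A real analysis would also test whether the differences are statistically significant.

```python
from statistics import mean

def mean_gain(pretest, posttest):
    """Average posttest-minus-pretest change for one group of students."""
    return mean(post - pre for pre, post in zip(pretest, posttest))

# Hypothetical (pretest, posttest) scores; higher means healthier practices.
scores = {
    "Group 1 (curriculum)": ([10, 12, 9, 11], [15, 16, 13, 16]),
    "Group 2 (presentations)": ([11, 10, 12, 9], [13, 12, 15, 10]),
    "Group 3 (control)": ([10, 11, 9, 12], [11, 11, 10, 12]),
}

gains = {name: mean_gain(pre, post) for name, (pre, post) in scores.items()}
control_gain = gains["Group 3 (control)"]

for name, gain in gains.items():
    # The gain beyond the control group's gain is the apparent program effect.
    print(f"{name}: gain {gain:.2f}, relative to control {gain - control_gain:+.2f}")
```

Subtracting the control group's gain removes changes that would have occurred anyway, such as those caused by simply taking the test twice.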

Posttest-Only  Control  Group  Design 

In this type of experimental design, participants
are randomly assigned to either the intervention
group or the control group. The intervention
group is given both a pretest and a posttest,
and the control group is given only a posttest.
Some evaluators believe that this type of experimental
design controls for any confusing results
that may be produced by a control group
pretest. For example, in the absence of exposure
to an intervention, control group test-takers
may nevertheless score quite differently on a
posttest simply as a result of having taken the
test the first time.

Group 1:  Measurement ► Program ► Measurement
          (Randomized Assignment to Group 1 or Group 2)
Group 2:  ----------------------► Measurement

Figure 5. Experimental design: posttest-only control group.

Group 1:  Measurement ► Program ► Measurement
          (Randomized Assignment to Group 1 or Group 2)
Group 2:  Measurement ► Program

Figure 6. Experimental design: delayed treatment.

Figure 5 depicts this type of true experimental
design. In this case, the evaluators
assume that the experimental and control
groups were similar before the program, but
they do not have test scores to prove that
assumption.

Delayed  Treatment  Design 

When there is good reason to believe that a program
is beneficial and has no harmful side
effects, it is difficult to deny services to participants
assigned to the control group. The
delayed treatment design is planned so that
participants are randomly assigned to receive
the prevention program services at different
times. For example, the new 5-week health curriculum
at Washington High School begins on
October 1 in Class A and starts 5 weeks later in
Class B. At the end of the first week in November,
measurements for Class A and Class B can
be compared to assess the effects of the curriculum
because it has not yet started in the second
class. However, once the program starts in the
second class, Class B can no longer serve as a
control group for Class A. Figure 6 portrays an
example of this type of true experimental
design.

Quasi-Experimental  Designs 

In many situations, an experimental design is
inappropriate, too costly, or impractical, for
example, when an intervention is applied to a
whole community. While there is a wide range
of quasi-experimental designs, most are variations
of two basic approaches. The first
approach involves the use of a comparison
group that was not exposed to the intervention.
The second approach uses the experimental
group as its own control. Thus, quasi-experimental
designs do not rely on random assignment
but employ other research techniques, as
well as statistical analyses, to assist with attribution
of effect. Three types of quasi-experimental
designs, discussed below, are commonly used.

Group 1:  Measurement ► Program ► Measurement
Group 2:  Measurement ----------► Measurement

Figure 7. Quasi-experimental design: nonequivalent comparison group.

Nonequivalent  Comparison  Group 

This type of design compares the outcome of
the group that receives the program with that of
a comparison group that does not receive the
program. A comparison group is selected because
it appears to be similar in many respects
to the group receiving the program, but assignment
to the experimental group versus the comparison
group is not random. Statistical analysis
is used to control for any preexisting differences
between the two groups. For example, a community
might wish to test the effectiveness of a
mandatory responsible beverage service program
for bars and restaurants as a means of
reducing impaired driving. Following the
implementation of the program, evaluators
could compare the occurrence of impaired driving
crashes in the community with that of a
similar community that did not have such a
program.

In this case, it is important to select a community
as similar as possible to the experimental
community based on certain preselected
criteria (e.g., race, socioeconomic status),
although a pretest allows evaluators to control
statistically for some initial differences between
groups. If the communities differ in the initial
drinking and driving rates, statistical adjustments
for these preintervention differences can
be made to ensure an accurate comparison of
the communities at the posttest. In order for the
pretest to be used in this manner, it must contain
measures of all relevant dimensions along
which the intervention and comparison groups
differ.
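
One common form of statistical adjustment is to estimate how strongly posttest rates track pretest rates within each group and then subtract the part of the posttest difference that the pretest gap alone would predict. The Python sketch below shows that calculation under simplifying assumptions: the data are invented rates for four districts in each community, the function names are hypothetical, and a single pretest measure is assumed to capture the relevant initial difference.

```python
from statistics import mean

def pooled_within_slope(groups):
    """Slope of posttest on pretest, pooled across groups."""
    sum_xy = 0.0
    sum_xx = 0.0
    for pre, post in groups:
        mean_pre, mean_post = mean(pre), mean(post)
        sum_xy += sum((x - mean_pre) * (y - mean_post) for x, y in zip(pre, post))
        sum_xx += sum((x - mean_pre) ** 2 for x in pre)
    return sum_xy / sum_xx

def adjusted_difference(program, comparison):
    """Posttest difference after removing what the pretest gap predicts."""
    slope = pooled_within_slope([program, comparison])
    raw_difference = mean(program[1]) - mean(comparison[1])
    pretest_gap = mean(program[0]) - mean(comparison[0])
    return raw_difference - slope * pretest_gap

# Hypothetical (pretest, posttest) drinking and driving rates by district.
program_community = ([20, 24, 28, 32], [15, 19, 23, 27])
comparison_community = ([16, 20, 24, 28], [15, 19, 23, 27])

raw = mean(program_community[1]) - mean(comparison_community[1])
adjusted = adjusted_difference(program_community, comparison_community)
print(f"Raw posttest difference: {raw:.1f}")
print(f"Adjusted difference:     {adjusted:.1f}")
```

In this made-up case the two communities look identical at the posttest, but the program community started from a higher rate, so the adjusted comparison shows a reduction that the raw comparison hides. The adjustment can correct only for differences that were actually measured at the pretest.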

A weakness of this quasi-experimental
design is that it is difficult to be certain that all
relevant characteristics have been measured
well enough to control statistically for all plausible
causes of posttest differences other than
the program itself. In general, the lack of randomized
assignment in this type of design
allows for greater threats to causal inferences.
Figure 7 presents the nonequivalent comparison
group design.

Time-Series  Design 

This design requires multiple measures of the
desired outcome in years or months before and
after the intervention. Measurements taken
before and after are then compared and analyzed
for trends and changes in trends. This
type of research design allows for lags in the
effects of the intervention. Archival data such as
impaired driving crash and arrest rates are particularly
useful for this type of analysis because
they are maintained for long periods of time.
Figure 8 depicts this design.
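
A minimal sketch of this kind of trend analysis is to fit a straight line to the measurements taken before the intervention, project that line forward, and compare the projection with what was actually observed afterward. The Python example below assumes equally spaced measurements and a linear preintervention trend. The crash counts and function names are hypothetical.

```python
def fit_line(values):
    """Least-squares intercept and slope for equally spaced measurements."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

def change_from_trend(before, after):
    """Observed values after the intervention minus the projected trend."""
    intercept, slope = fit_line(before)
    start = len(before)
    projected = [intercept + slope * (start + i) for i in range(len(after))]
    return [observed - expected for observed, expected in zip(after, projected)]

# Hypothetical yearly counts of impaired driving crashes.
before = [100, 98, 96, 94, 92, 90]
after = [80, 78, 77, 75]

deviations = change_from_trend(before, after)
average_deviation = sum(deviations) / len(deviations)
print("Deviation from projected trend by year:", deviations)
print(f"Average deviation: {average_deviation:.1f}")
```

Here crashes were already falling before the program began, so a simple before-and-after comparison of averages would overstate the program's effect. Comparing against the projected trend credits the program only with the additional decline.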

A time-series design would not establish
causality if an unplanned event coincided with
the program. For example, suppose a serious
drunk driving crash, in which a prominent citizen
was killed, occurred during the course of
the evaluation. It would be hard to say whether
the responsible beverage server program or the
emotional event and its publicity resulted in
any reduction in impaired driving. For this reason,
time-series evaluations lead to stronger
conclusions when a program is introduced into
different locations in different years, and trends
over time are examined separately for each
location.

Measurement ► Measurement ► Program ► Measurement ► Measurement

Figure 8. Quasi-experimental design: time series.

Group 1:  Measurement ► Measurement ► Program ► Measurement ► Measurement
Group 2:  Measurement ► Measurement ----------► Measurement ► Measurement

Figure 9. Quasi-experimental design: time series with comparison group.


Time  Series  With  Comparison  Group 

This  design  combines  the  use  of  a  comparison 
group  with  a  time-series  design.  Two  groups  or 
communities  are  compared  over  time  to  deter- 
mine whether  the  experimental  group  changes 
more  in  the  expected  direction  following  the 
introduction  of  an  intervention  than  a  similar 
comparison  group  that  does  not  receive  the 
intervention.  In  the  responsible  beverage  ser- 
vice example,  impaired  driving  crash  rates 
would  be  examined  over  time  to  detect  general 
trends.  The  rates  would  then  be  compared  fol- 
lowing the  implementation  of  the  responsible 
beverage  service  program  to  determine  whether 
the  crash  rates  decreased  in  the  experimental 
community  more  than  would  be  expected  from 
the general trend and more than in the comparison
community. The combination of time series
and a comparison group allows a more confident
conclusion that the program caused the
observed  effect  because  it  controls  for  any  gen- 
eral trends  that  might  be  occurring  at  the  same 
time  as  the  program.  Figure  9  shows  a  time- 
series  design  with  a  comparison  group. 
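
One simple way to express "decreased more than in the comparison community" is to subtract the comparison group's change from the program group's change. The sketch below assumes hypothetical crash rates and an invented function name; it ignores trends within each period, which a full analysis would also examine.

```python
from statistics import mean


def difference_in_differences(prog_before, prog_after, comp_before, comp_after):
    """Change in the program community minus change in the comparison community."""
    program_change = mean(prog_after) - mean(prog_before)
    comparison_change = mean(comp_after) - mean(comp_before)
    return program_change - comparison_change


# Hypothetical crash rates per 10,000 drivers (illustrative only).
program_before = [50, 52, 51]
program_after = [40, 41, 39]
comparison_before = [48, 50, 49]
comparison_after = [46, 47, 45]

estimate = difference_in_differences(
    program_before, program_after, comparison_before, comparison_after
)
print(estimate)  # negative: the program community fell further than the comparison
```

Here the program community's rate fell by 11 and the comparison community's by 3, so the estimate attributes a drop of 8 to something other than the general trend shared by both communities.
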

Summary 

This  chapter  detailed  several  types  of  experi- 
mental and  quasi-experimental  research  designs 
that  could  be  used  in  outcome  evaluations.  It 
also  asserted  that  the  strength  of  the  research 
design  rests  in  its  appropriateness  to  the  inter- 
vention it  seeks  to  evaluate  and  its  ability  to 
attribute  causality  to  the  intervention.  For  more 
detailed  information  about  research  designs, 
consult the books listed in the appendix.
Prerequisites to Data Collection


Once a research design to evaluate outcomes has
been chosen, the evaluator-practitioner team
must determine what data
must  be  collected  to  answer  the  questions  posed 
in  the  research.  It  is  important  that  evaluators 
collect  only  data  that  can  help  answer  evalua- 
tion questions,  because  data  collection  is  often  a 
burden  on  staff;  it  often  interrupts  normal  work 
practices  and  sometimes  requires  staff  assis- 
tance. Regardless  of  the  types  of  data  needed, 
and  before  any  are  collected,  a  number  of  issues 
must  be  dealt  with. 

Evaluations  often  involve  original  data  col- 
lection (e.g.,  questionnaires,  interviews,  obser- 
vations) and  secondary  data  collection  from 
established  sources  through  archival  research. 
Both  types  of  data  collection  require  securing 
the  support  and  assistance  of  program  staff, 
gaining  access  to  records  and  people,  securing 
consent  to  participate,  ensuring  confidentiality, 
and  committing  to  provide  useful  feedback  on  a 
regular  and  timely  basis.  In  addition,  decisions 
must  be  made  about  how  to  choose  individuals 
for  inclusion  in  the  data  collection  effort  and 
how  to  cope  with  attrition.  Each  of  these  issues 
is  detailed  below. 

Securing  the  Support  and  Assistance 
of  Staff 

Securing  the  support  and  assistance  of  interven- 
tion staff  is  a  critical  first  step  for  both  qualita- 
tive and  quantitative  research.  It  is  part  of  the 
process  of  gaining  the  trust  of  staff  who  may 
perceive  evaluation  as  a  threat  to  the  program 
and,  hence,  to  their  job  security.  Thus,  people 
associated  with  the  project  (from  administrators 
to support staff and, if applicable, clients) should
be  well  informed  about  the  evaluation's  pur- 
pose and  activities.  This  may  involve  asking  the 
evaluator  to  speak  during  a  staff  meeting  and  to 
provide  a  brief  written  description  of  what  the 
evaluation  is  designed  to  assess,  what  types  of 
data  collection  activities  will  be  conducted,  and 
the  evaluation's  projected  timeline.  Program 
administrators  and  evaluators  might  also  devel- 
op a  similar  statement  that  could  be  distributed 
to  clients,  with  clients  being  invited  to  direct 
any  further  questions  about  the  evaluation  to 
either  program  administrators  or  evaluators. 

Although  not  all  staff  may  have  been 
involved  in  earlier  collaboration  with  the  pro- 
gram evaluator,  staff  service  providers  (who 
often  tend  to  be  most  skeptical  of  evaluation 
research)  should  be  invited  to  review  the  evalu- 
ation design  and  methodology.  If  possible,  they 
should  also  be  asked  to  provide  suggestions  as 
to  how  the  methodology  and  planned  logistical 
arrangements  might  be  revised  to  ensure  that 
the evaluation will be conducted in a way that
is  least  disruptive  to  the  day-to-day  operation  of 
the  program. 

If  staff  are  expected  to  assist  with  data 
collection,  it  is  important  to  assign  roles  and 
responsibilities  and  to  record  who  must  do 
what  by  when.  If  staff  must  collect  information 
regularly,  the  usefulness  of  this  information  to 
the  evaluation  should  be  explained  to  them. 
Feedback  on  the  progress  of  the  evaluation,  as 
well  as  preliminary  findings,  should  be  shared 
regularly  with  staff.  Staff  are  more  likely  to 
invest  themselves  in  the  evaluation  when  they 
get  regular  feedback  that  relates  directly  to  their 
roles  in  the  project. 

Gaining  Access  to  Data 

Approaching  an  organization  for  access  to  data 
that  have  already  been  collected  requires  a  spe- 
cific description  of  the  information  needed. 
Similarly,  approaching  an  organization  for 
access  to  its  clients  also  requires  a  detailed  state- 
ment. Many  organizations  have  a  formal  appli- 
cation procedure  that  must  be  followed  for 
access  to  data.  For  example,  schools  generally 
have  processes  for  administering  surveys,  and 
police  generally  have  procedures  for  releasing 
individual records for research purposes. Other
organizations require that requests for access be
made in writing, sometimes supplemented with
details about how the information will be used.


How To Gain Access:

•  Write out evaluation questions and
information needs.

•  Find out about formal application
processes.

•  Concurrently, use informal process by
cultivating an ally or group of insiders
who can help.

•  Maintain access with frequent
reporting.


Gaining  access  to  records  and  people  may 
take  several  months  or  longer;  evaluators  and 
prevention  administrators  should  begin  negoti- 
ations early  in  the  program  to  ensure  adequate 
access  to  information.  Personal  introductions  to 
recordkeepers — the  individuals  who  maintain 
and  are  most  familiar  with  the  data — can  be 
extremely  helpful  to  the  evaluators.  Early  on, 
the  evaluator  should  explain  to  the  recordkeep- 
ers what  type  of  information  is  needed  and  for 
what  purpose.  These  individuals  often  can 
point  out  useful  data  sources  not  already 
known  to  the  evaluator. 

Another  approach  is  to  convene  a  com- 
mittee of  representatives  of  different  groups 
involved  in  the  project.  Members  of  the 
group  may  have  the  data  needed  for  the 
evaluation,  or  they  can  assist  in  gaining  access 
to  the  information.  As  the  intervention  pro- 
ceeds, reporting  information  about  activities 
and  progress  to  collaborators  helps  maintain 
access  and  cooperation. 

Securing  Consent 

Evaluators often require a signed, informed
consent document from participants. This
serves four purposes: (1) informing participants
of the goals and procedures of the evaluation;
(2) ensuring the confidentiality of responses;
(3) informing participants of their right to
decline participation in the evaluation, and that
declining will not affect their receipt of program
services; and (4) describing the established
procedures to safeguard individual confidentiality
or anonymity. This document should be
made available to the participants whenever
requested. If children are involved, it is often
necessary to get parents' consent in addition to
the children's consent. Participants should
always have an opportunity to ask questions
about the evaluation, either in person or by
telephone.


How To Gain Consent and Ensure
Confidentiality:

•  Consult an institutional review board
(IRB) that can assist in establishing
safeguards. Call a local university or
research firm and ask to speak to the
chair of the IRB. Read the guidance on
participant protection provided by the
Office for Protection of Participants in
Research.

•  Train all data collectors to uphold the
confidentiality procedures. Monitor
them.

•  Inform participants about the purpose
of the evaluation and what will be
asked of them. Assure them that
confidentiality will be protected and
that their participation is voluntary.

•  Design consent forms and obtain
participants' signatures.

•  Plan to secure data through coding
names, handling sensitive information
carefully, and limiting access.

•  Ensure that reports never link
individual data with names.


Ensuring  Confidentiality 

Substance  abuse  prevention  evaluations  often 
elicit  sensitive  information,  such  as  alcohol  and 
drug use or criminal behavior. One potential
risk  in  any  evaluation  is  the  release  of  confiden- 
tial information  that  could  result  in  harm  to  the 
participant.  Even  less  sensitive  information,  if 
publicly  released  or  carelessly  handled,  could 
embarrass an individual and erode trust in the
evaluators. The privacy of individuals and the
confidentiality  of  information  must  be  main- 
tained in  all  evaluations.  Most  institutions  have 
institutional  review  boards  (IRBs)  to  help  guard 
the  interests  of  evaluation  participants.  They 
weigh  research  procedures  and  their  potential 
effects  on  individuals  against  the  value  of  the 
information  to  be  gained. 

Evaluators  and  prevention  administrators 
should  always  establish  formal  procedures  to 
collect,  store,  and  report  confidential  data.  Some 
procedures  are  listed  in  the  accompanying  box. 
One  recommended  practice  to  help  ensure  con- 
fidentiality is  to  use  identification  numbers 
rather  than  names  on  project  instruments. 
Another  is  to  conduct  interviews  in  a  private 
area  and  to  provide  privacy  when  respondents 
are  completing  surveys.  Evaluation  data,  espe- 
cially sensitive  information,  should  be  stored  in 
locked  facilities.  Evaluation  reports  should  not 
link  specific  data,  information,  or  responses 
with  individual  participants  or  their  families. 
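
The practice of using identification numbers rather than names can be sketched as follows. This is an illustration under stated assumptions, not a complete data-security procedure: the function names and record fields are hypothetical, participant names are assumed to be unique, and the name-to-number linkage is assumed to be stored separately under lock.

```python
import secrets


def assign_study_ids(names):
    """Return a name -> study ID linkage using unique random IDs.

    The linkage itself is confidential and should be stored apart
    from the evaluation data.
    """
    linkage = {}
    used = set()
    for name in names:
        study_id = secrets.token_hex(4)
        while study_id in used:  # regenerate on the rare collision
            study_id = secrets.token_hex(4)
        used.add(study_id)
        linkage[name] = study_id
    return linkage


def deidentify(records, linkage):
    """Replace each record's 'name' field with the participant's study ID."""
    cleaned = []
    for record in records:
        row = {"study_id": linkage[record["name"]]}
        row.update({k: v for k, v in record.items() if k != "name"})
        cleaned.append(row)
    return cleaned


# Hypothetical survey responses (illustrative only).
responses = [
    {"name": "Participant One", "drinks_last_30_days": 4},
    {"name": "Participant Two", "drinks_last_30_days": 0},
]
linkage = assign_study_ids(r["name"] for r in responses)
coded = deidentify(responses, linkage)
print(coded)
```

Only the coded records would circulate among analysts; anyone who needs to recontact a participant would consult the separately stored linkage.
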

Committing  To  Provide  Regular 
Feedback 

The  importance  of  providing  feedback  to  staff 
should  not  be  underestimated.  Evaluations 
often  interrupt  or  interfere  with  their  work,  and 
staff  frequently  are  not  given  feedback  about 
the  evaluation  progress  and  preliminary  find- 
ings. As  a  result,  staff  are  often  skeptical  of,  or 
even  hostile  toward,  evaluation.  Evaluation 
results,  even  if  only  preliminary,  can  be  of  great 
interest to staff. Sharing this information can do
much  to  overcome  staff  resistance. 

Before  beginning  the  data  collection 
process,  evaluators  should  be  prepared  to  com- 
municate frequently  with  various  audiences, 
such  as  intervention  staff,  clients,  community 
groups,  and  the  press.  Early  feedback  will  prob- 
ably revolve  around  the  data  collection  process 
itself  (e.g.,  problems).  Later  on,  evaluators 
might  share  preliminary  findings  based  on  an 
initial  overview  of  a  body  of  data  as  it  is  collect- 
ed. Preliminary  and  more  conclusive  findings 
based  on  substantive  analysis  should  be  pre- 
sented in  the  form  of  brief  written  updates, 
because  program  practitioners  are  not  likely  to 
have  time  to  read  detailed  reports. 
A basic update summary might include the
following: 

•  A  brief  statement  of  the  intervention's  goals 
and  objectives  and  the  activity  or  activities 
designed  to  meet  them; 

•  A  summary  of  the  activity  or  activities  dis- 
cussed in  the  update; 

•  The  evaluation  questions  addressed  in  the 
update; 

•  A  brief  listing  of  research  methods 
employed; 

•  A  few  graphs,  charts,  or  bullet  listings 
showing  major  findings;  and 

•  A  statement  of  what  the  results  might  mean 
to  the  audience. 

Staff  are  an  important  first  audience  in  that 
they  can  act  as  a  reality  check  and  give  feedback 
on  the  accuracy  of  the  findings.  Staff  also  can 
use  findings  from  the  evaluation  to  improve 
activities.  It  is  important  to  note  that  negative 
findings  should  not  be  concealed;  rather,  they 
should  be  presented  and  explained.  Discussion 
of  this  sort  is  of  significant  value  to  a  program 
that  has  accepted  the  evaluation  as  necessary 
and  useful. 

Sampling  Strategies 

If  the  number  of  people  served  by  a  prevention 
program  is  relatively  small  (fewer  than  50),  col- 
lecting data  from  the  entire  population  can 
improve  the  credibility  of  the  evaluation.  How- 
ever, many  prevention  programs  cannot  afford 
to  study  all  the  people  they  serve.  For  example, 
programs  that  provide  prevention  curricula  to 
the  general  school  population  may  be  reaching 
tens  of  thousands  of  children  each  year.  With 
such  a  large  number  of  children  served,  the  pro- 
gram will  save  time  and  money  by  studying  a 
portion  of  the  population  served — a  sample. 

Determining  the  ideal  size  for  a  sample  can 
be  a  complex  process.  Sample  size  determines 
the  confidence  intervals  or  margin  of  error  in 
the  statistics  calculated  from  a  sample — the 
larger  the  sample,  the  smaller  the  expected 
error.  That  is,  the  larger  the  sample,  the  more 
accurate  (i.e.,  the  closer  to  the  value  that  would 
have been calculated for the entire population)
the statistics calculated for the total group will
be,  and  the  more  powerful  the  study  will  be  in 
detecting  smaller  differences.  In  most  instances, 
programs  will  need  help  from  a  sampling 
expert  to  determine  the  best  sample  size. 
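
The relationship between sample size and margin of error can be illustrated with the standard approximation for a proportion. The sketch assumes a simple random sample from a large population, a 95 percent confidence level (z = 1.96), and the most conservative proportion of 0.5; the function name is invented for this example, and it does not replace advice from a sampling expert.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (100, 400, 1600):
    print(n, round(margin_of_error(n), 4))
```

Under these assumptions, quadrupling the sample size halves the margin of error (roughly 10, 5, and 2.5 percentage points for samples of 100, 400, and 1,600), which is why added precision becomes increasingly expensive.
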

Evaluation  results  from  a  sample  may  be 
safely  generalized  to  an  entire  population,  but 
only  if  the  sample  is  of  a  sufficient  size  and  is  repre- 
sentative of  the  total  population.  For  example, 
suppose  a  prevention  program  provides  com- 
munity workshops  on  setting  family  rules  about 
drugs.  Program  staff  want  to  know  if  their  ser- 
vices change  parental  behavior.  To  answer  this 
question,  a  broad  cross-section  of  parents  from  a 
large  portion  of  workshop  locations  (a  represen- 
tative sample)  should  be  included  in  the  study. 
A  representative  sample  will  show  how  well 
parents  actually  establish  and  maintain  rules 
about  drugs  in  their  families  after  the  work- 
shops. A  nonrepresentative  sample  (e.g.,  fami- 
lies who  volunteer  for  an  evaluation  interview) 
will  provide  descriptive  information  about 
some  parents  but  could  give  a  biased  picture 
of  program  effectiveness  for  all  families. 

Two  probability  sampling  strategies  used 
by  experts  to  ensure  a  representative  sample  are 
simple  random  and  stratified  random  sampling. 
In  a  simple  random  sample,  each  object  (person, 
family,  block,  school,  etc.)  is  drawn  with  equal 
probability  and  independently  of  every  other 
object. In stratified random sampling, the
population is first divided into groups, called
strata (e.g., by ethnicity or gender); then a
simple random sample is drawn from each
stratum. Probabilities of selection may differ
across strata, which makes it possible to
sample smaller groups at a higher rate to
ensure their adequate representation. Stratified
sampling  makes  the  analysis  more  complex  and 
should  always  involve  the  guidance  of  a  person 
competent  in  this  practice. 
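
Both strategies can be sketched with Python's standard `random` module. The student roster, the group labels, and the function names below are hypothetical; the stratified version draws a fixed number from each stratum so that the smaller group is sampled at a higher rate.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n units so that every unit has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def stratified_random_sample(population, stratum_of, n_per_stratum, seed=None):
    """Divide the population into strata, then draw a simple random sample from each."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for label, units in strata.items():
        n = min(n_per_stratum[label], len(units))
        sample.extend(rng.sample(units, n))
    return sample


# Hypothetical roster: 80 students in group "A" and 20 in group "B".
roster = [("A", i) for i in range(80)] + [("B", i) for i in range(20)]

srs = simple_random_sample(roster, 20, seed=1)
stratified = stratified_random_sample(
    roster, stratum_of=lambda unit: unit[0], n_per_stratum={"A": 10, "B": 10}, seed=1
)
print(sum(1 for unit in stratified if unit[0] == "B"))  # always 10 by design
```

Because group "B" is oversampled here, overall estimates would need to be weighted during analysis, which is one reason the text recommends expert guidance for stratified designs.
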

The  budget  for  evaluation  is  often  the  pri- 
mary determinant  of  sample  size.  Frequently, 
information  on  intermediate  effects  and  out- 
comes may  be  obtained  only  through  costly 
personal  interviews  or  survey  questionnaires. 
In determining the amount of information
desired,  the  expense  of  increased  sampling 
should  be  considered.  There  is  generally  a 
tradeoff  between  choosing  a  narrow  and  less 
expensive  sample  and  a  broader,  more  expen- 
sive sample. 

Selecting  Participants  for  Qualitative 
Evaluation 

Selecting  participants  for  qualitative  research  is 
as  important  as  it  is  for  quantitative  sampling 
but  may  involve  different  procedures.  Samples 
selected  for  quantitative  studies  are  usually  ran- 
domly chosen.  However,  since  the  goal  of  quali- 
tative methods  is  to  get  the  most  information 
possible  from  smaller  numbers  of  respondents, 
participants  may  be  selected  purposefully, 
although  qualitative  evaluators  should  be  just 
as  concerned  about  choosing  a  representative 
sample  as  are  quantitative  evaluators. 

Qualitative  evaluators  can  consider  several 
strategies  for  sample  selection.  One  method 
involves  looking  at  typical  cases  only.  Another 
method  involves  including  extreme  (unusual  or 
special)  examples  to  make  for  maximum  varia- 
tion in  the  sample.  The  extremes  could  vary  by 
age,  role,  and  experiences,  and  be  compared  on 
commonalities  and  differences  among  them  or 
between  the  extreme  and  the  typical  examples. 
Critical  or  politically  relevant  cases  could  be 
studied  to  gain  indepth  information  on  specific 
situations.  Knowledge  of  the  population  and 
setting  through  firsthand  exposure  to  the  set- 
ting or  from  information  provided  by  program 
staff  can  help  in  selecting  qualitative  samples. 

At  times  it  may  not  be  possible  to  get  to 
know  a  setting  well  enough  to  select  cases 
based  on  certain  characteristics  or  differences. 
Snowball  sampling,  also  called  chain  sampling, 
could  be  used  in  this  situation.  An  evaluator 
starts  with  a  few  respondents  and  asks  each  of 
them  to  suggest  other  respondents  who  could 
provide  useful  information.  For  example,  sup- 
pose a  researcher  wanted  to  interview  high 
school  seniors  who  have  never  experimented 
with  alcohol  or  other  drugs.  He  or  she  might 
ask  a  guidance  counselor  to  suggest  a  few  stu- 
dents to  whom  this  description  seems  to  apply. 
These students, in turn, might refer their friends
to the interviewer. The hazard in snowball sampling
is that the sample may be homogeneous
and  unrepresentative,  and  thus  be  biased  and 
ultimately  inefficient,  although  convenient. 
Therefore,  the  decision  to  use  snowball  sam- 
pling should  be  made  only  after  the  focus  of 
research  is  known. 
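
The referral chain in snowball sampling can be pictured as a breadth-first walk through a network of suggestions. In the sketch below, the names, the referral table, and the `get_referrals` callback are all hypothetical stand-ins for what respondents would actually tell an interviewer.

```python
from collections import deque


def snowball_sample(seeds, get_referrals, max_size):
    """Start from seed respondents and follow referrals until max_size is reached."""
    seen = set()
    queue = deque()
    for person in seeds:
        if person not in seen:
            seen.add(person)
            queue.append(person)
    sample = []
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referral in get_referrals(person):
            if referral not in seen:  # never interview the same person twice
                seen.add(referral)
                queue.append(referral)
    return sample


# Hypothetical referrals: who each respondent suggests (illustrative only).
referrals = {
    "Ana": ["Ben", "Cal"],
    "Ben": ["Ana", "Dee"],
    "Cal": [],
    "Dee": ["Eli"],
}
sample = snowball_sample(["Ana"], lambda p: referrals.get(p, []), max_size=4)
print(sample)  # ['Ana', 'Ben', 'Cal', 'Dee']
```

Everyone in the resulting sample traces back to the same starting respondent, which makes concrete why such samples tend to be homogeneous.
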

Another  sampling  strategy  is  called  conve- 
nience sampling.  Participants  are  selected  on 
the  basis  of  easy  access  (e.g.,  selecting  only  par- 
ticipants who  show  up  on  a  particular  day). 
This  method  should  be  avoided  except  in 
exploratory  studies  because  it  may  not  meet  the 
informational  and  analytical  needs  of  an  evalu- 
ation. The  findings  from  this  type  of  sample 
may  be  limited  in  applicability  to  other  partici- 
pants or  locations. 

Coping  With  Attrition 

Sometimes  people  in  a  study  cannot  be  found, 
or  information  cannot  be  collected  from  them. 
People  move  away,  drop  out  of  the  program, 
refuse  to  complete  a  questionnaire,  or  are  absent 
from  school  or  work  on  the  day  of  the  study. 
Losing  people  from  the  study,  called  attrition, 
can  affect  results  in  two  ways.  First,  if  attrition 
is  not  considered  when  determining  the  desired 
sample  size,  it  may  result  in  too  few  people  in 
the  study,  thus  jeopardizing  the  confidence  in 
the  results.  Second,  people  who  cannot  be  locat- 
ed often  share  certain  characteristics.  Their 
exclusion  jeopardizes  the  representativeness 
gained  by  random  sampling  or  random  assign- 
ment. For  example,  research  has  shown  that 
rates  of  alcohol  and  drug  abuse  are  higher 
among  people  who  move  frequently  than 
among  people  who  do  not.  Similarly,  students 
who  are  frequently  absent  from  school  are  more 
likely  to  abuse  alcohol  and  drugs  than  students 
who  attend  school  regularly. 

Sample  attrition,  although  potentially  dam- 
aging to  a  study,  can  usually  be  addressed  so 
the  study  is  only  partially  compromised.  One 
method  is  to  select  a  larger  sample  for  the  study 
than  is  needed.  If  some  participants  drop  out, 
the  sample  will  remain  large  enough  for  analy- 
sis. Another  technique  to  improve  retention 
rates  is  to  establish  a  tracking  system  to  keep  in 
contact  with  participants  over  the  life  of  the 
study. Finally, pretest data, available for all participants,
can be used to assess the potential
damage  created  by  attrition.  If  attrition  results 
in  nonequivalent  intervention  and  control 
groups,  as  might  occur  if  the  highest  risk  con- 
trol group  members  are  unavailable  for  the 
posttest  but  the  highest  risk  intervention  group 
members are successfully located, sound preintervention
data can be used in an attempt to
control  statistically  for  some  of  the  differences. 
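
Two of these steps lend themselves to simple arithmetic: inflating the initial sample to allow for expected losses, and using pretest data to check whether those who were lost differed from those who stayed. The sketch below uses hypothetical numbers and invented function names; the expected attrition rate is an assumption the evaluator must supply.

```python
import math
from statistics import mean


def recruitment_target(needed_at_followup, expected_attrition_rate):
    """Number to recruit so that enough participants remain after attrition."""
    if not 0 <= expected_attrition_rate < 1:
        raise ValueError("expected_attrition_rate must be in [0, 1)")
    return math.ceil(needed_at_followup / (1 - expected_attrition_rate))


def pretest_gap(pretest_scores, completed_ids):
    """Mean pretest score of participants lost minus that of participants retained."""
    stayed = [s for pid, s in pretest_scores.items() if pid in completed_ids]
    lost = [s for pid, s in pretest_scores.items() if pid not in completed_ids]
    return mean(lost) - mean(stayed)


# Need 150 completed posttests and expect to lose 25 percent of participants.
print(recruitment_target(150, 0.25))  # 200

# Hypothetical pretest risk scores keyed by study ID (illustrative only).
pretest = {"p1": 2, "p2": 4, "p3": 8, "p4": 10}
gap = pretest_gap(pretest, completed_ids={"p1", "p2"})
print(gap)  # 6: those lost scored higher at pretest than those retained
```

A large gap signals that attrition was not random, and running the same check separately for the intervention and control groups shows whether the groups remain comparable.
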

Summary 

This  chapter  focused  on  critical  practical  issues 
that  should  be  of  concern  to  evaluators  before 
they  begin  data  collection.  Whether  implement- 
ing quantitative  or  qualitative  data  collection, 
evaluators  should  be  concerned  with  establish- 
ing rapport  with  the  prevention  personnel  and 
participants  who  are  the  focus  of  evaluation. 
Negotiating entry with these groups will facilitate
cooperation and assistance when evaluators
are actually collecting data. Taking early steps
to  ensure  access  to  archival  data  was  also 
advised.  Frequently,  it  can  take  several  weeks  to 
receive  permission  and  retrieve  documents 
from  sources  other  than  the  program  being 
evaluated.  Evaluators  should  also  secure  a  for- 
mal consent  to  participate  in  the  study  from 
those  involved,  particularly  from  program 
clients.  Measures  to  guarantee  that  all  data  are 
kept  confidential  are  also  important.  Another 
prerequisite  is  making  a  commitment  to  pro- 
vide regular  feedback  to  intervention  staff 
throughout  data  collection  and  analysis.  Deci- 
sions about  sampling  or  selecting  participants 
for  data  collection  must  be  made  before  this 
process  begins,  and  a  strategy  for  coping  with 
attrition must be adopted. While this information
may not be new to the experienced evaluator,
it should be reviewed by others who may
not  be  familiar  with  conducting  an  evaluation. 
Common Data Collection Methods


Prevention programs are often complex,
with multiple purposes, sites, and sources
of  data.  Combining  a  variety  of  data-gathering 
strategies  results  in  richer,  more  meaningful 
data.  This  chapter  begins  with  a  brief  discussion 
of  the  distinction  between  qualitative  and  quan- 
titative methods.  Both  quantitative  and  qualita- 
tive methods  and  data  can  be  used  to  assess 
either  intervention  processes  or  outcomes.  Table 
5  illustrates  how  this  is  possible.  There  does 
tend  to  be  a  shift  over  the  course  of  an  evalua- 
tion, however,  with  more  qualitative  measures 
used  early  for  process  evaluation  and  quantita- 
tive measures  used  later  for  outcome  evalua- 
tion. See  figure  10  for  a  depiction  of  how  this 
mixture  tends  to  change. 

Qualitative  Versus  Quantitative 
Methods and Measurement

Qualitative  methods  result  in  descriptions  of 
problems,  behaviors,  or  events.  Qualitative 
measures  provide  the  stories  that  illustrate  the 
nature  of  the  problem  addressed;  the  processes 
by  which  those  problems  were  addressed;  the 
complex,  multifaceted  dimensions  of  success; 
and  the  meaning  of  substance-related  problem 
prevention for quality of life in a healthy community.
Qualitative data provide more details
or  nuances  but  may  be  viewed  as  less  objective 
and  may  be  difficult  to  analyze  or  summarize 
and  compare  systematically. 

In contrast, quantitative measurement consists
of counts, rates, or other statistics that
document the actual existence or absence of
problems, behaviors, or occurrences. Quantitative
data are generally considered more objective
and easier to summarize and compare than
qualitative data, but they may not provide all
the information needed for interpretations of
findings.

Table 5. Measures for Process and Outcome Evaluations

Process Evaluation: Qualitative Measures

•  Focus group and interview data on support for intervention
•  Observational data on mobilization efforts
•  Interview data on organizational climate within the intervention
•  Content analysis data on intervention's plan to produce desired
program results
•  Focus group and interview data on the nature of service delivery
•  Archival data on the nature of service delivery
•  Observational data on the nature of service delivery and context
•  Interview data from program dropouts on causes of attrition
•  Archival data on publicity, correspondence, recruitment efforts,
attendance patterns, expenditures
•  Content analysis data on media coverage

Process Evaluation: Quantitative Measures

•  Questionnaire data on perceived need for the intervention
•  Questionnaire data on nature and extent of problems
•  Questionnaire data on participants' perceptions of intervention's
"organizational climate"
•  Archival data on numbers of people and constituencies involved in
the intervention
•  Questionnaire data on the nature of service delivery or policy
implementation
•  Questionnaire data from program dropouts to ascertain causes of
attrition
•  Questionnaire data on client satisfaction with program
•  Counts of media coverage, enforcement activities, attendance at
activities

Outcome Evaluation: Qualitative Measures

•  Focus group data indicating changes in attitude and knowledge
•  Interview data indicating changes in attitude and knowledge
•  Observational data noting changes in the nature of the problem
•  Archival data from intervention records concerning perceived
effects
•  Archival data on incidence of focal problem

Outcome Evaluation: Quantitative Measures

•  Questionnaire data from participants and/or a control or comparison
group from instruments designed to detect changes in attitude,
knowledge, and behavior
•  Followup questionnaire data reflecting changes in attitude,
knowledge, and behavior
•  Questionnaire data on client perceptions of program effects
•  Questionnaire data on community perceptions of the focal problem
and the intervention
•  Observational data noting changes in problem incidence
•  Archival data on intervention's monetary expenditures
•  Hospital and police records on incidence of focal problem (e.g.,
injury, death, vandalism)
•  Counts of alcohol and tobacco purchase attempts from licensed
establishments
•  Police enforcement records
•  Counts of alcohol and tobacco nonmedia advertising
•  Cost-benefit data on monetary expenditures and documented positive
effects

Figure 10. Research methods employed in comprehensive program evaluation
(horizontal axis: stages of program development).

Qualitative  Methods 

Qualitative  methods  produce  findings  that 
show  patterns  or  inconsistencies  among  data. 
Frequently  employed  qualitative  data  collection 
methods  include  interviews,  focus  groups,  par- 
ticipant-observation, and  archival  research. 
Each  of  these  methods  is  described  below. 

Interviews 

Interviews  are  a  common  qualitative  data  col- 
lection technique.  There  are  three  standard 
interview  formats:  structured,  semistructured, 
and  unstructured.  Structured  interviews  (face- 
to-face  questionnaires)  are  generally  considered 
a  quantitative  method  and  are  discussed  later 
in  this  chapter.  Other  types  of  interviews  that 
allow  for  open-ended  responses  provide  a 
wealth  of  data  about  a  specific  topic  but  require 
additional  time  to  analyze.  Basically,  the  less 
structure  is  applied  in  the  interview,  the  more 
the  information  gained  will  depend  on  the  abil- 
ities of  the  interviewer.  Interviewers  must  be 
skilled in interviewing techniques, have excellent
interpersonal skills, and be sensitive to
cultural differences. Usually, interviews should
occur  in  a  private  setting,  with  assurances  of 
anonymity  and  confidentiality. 

Semistructured  Interviews.  Semistructured  inter- 
views involve  asking  many  respondents  the 
same  series  of  questions;  however,  responses 
are  not  limited  to  a  given  set  of  answers.  Conse- 
quently, the  evaluator  is  assured  that  data  are 
collected  in  response  to  the  same  series  of  ques- 
tions from  a  number  of  individuals.  Both  semi- 
structured  and  unstructured  interviews  create 
an  interactive  situation  that  frequently  involves 
the  interviewer's  asking  additional  questions  to 
draw  out  detail.  Many  observers  believe  that 
semistructured  and  unstructured  interviews 
facilitate  more  candid  responses.  Because  the 
content  of  responses  may  range  tremendously, 
however,  significantly  more  time  may  be 
required  for  data  analysis. 

Unstructured  Interviews.  Unstructured  inter- 
views may  begin  with  a  series  of  questions  that 
the  evaluator  wants  answered,  but  the  inter- 
viewer tends  to  let  the  conversation  flow  natu- 
rally rather  than  being  constrained  by  a  set 
outline.  What  is  paramount  in  the  unstructured 
interview  is  that  the  respondent  feel  free  to 
answer  questions  in  his  or  her  own  words  and 
that the evaluator be prepared to ask unanticipated
questions based on information contained
in responses to earlier questions.
Unstructured  interviews  differ  from  conversa- 
tions in  that  the  evaluator  is  primarily  con- 
cerned with  gathering  information  and  not 
with  contributing  to  conversation — that  is, 
extended  participation  by  the  evaluator  is 
only  a  means  of  clarifying  question  intent  and 
steering  the  interview. 

Unstructured  interviews  are  useful  when 
exploring  sensitive  issues,  emerging  events,  or 
unique  experiences.  They  are  also  common 
when  a  respondent  is  expected  to  provide 
unique  information.  Consequently,  the  un- 
structured interview  method  is  not  usually 
employed  when  an  evaluator  is  trying  to 
collect  information  about  the  same  topics 
from  several  persons. 

Focus  Groups 

Focus  groups  are  informal  discussion  sessions 
that  typically  involve  6  to  10  people.  Discus- 
sions center  around  specific  topics  and  usually 
last  between  1  and  2  hours.  Participants  with 
similar  specific  characteristics  are  invited  to 
attend.  For  example,  if  one  suspected  that  ado- 
lescent substance  abuse  was  particularly  perva- 
sive among  male  athletes  in  a  small  town  and 
wanted  to  know  more  about  this  phenomenon, 
one  could  conduct  a  focus  group  with  adoles- 
cent male  athletes  who  admit  to  engaging  in 
alcohol  or  drug  use.  Or,  one  could  conduct  a 
focus  group  with  junior  and  senior  high  school 
coaches  of  male  sports  teams  to  explore  why 
they  think  the  problem  exists.  Or,  one  could 
conduct  a  focus  group  with  parents  of  adoles- 
cent male  athletes  to  assess  their  perspectives 
on  the  problem. 

Facilitators  of  focus  groups  use  a  discussion 
guide  with  topics  or  questions  to  be  covered. 
The  groups  are  frequently  taped  with  permis- 
sion from  participants.  Generally,  a  skilled  facil- 
itator guides  the  group  interaction,  while  a 
colleague  takes  notes,  runs  the  tape  recorder, 
and  handles  other  logistics.  Focus  groups  differ 
from  interviews  in  that  the  elements  of  interac- 
tion and  discussion  among  respondents  are 
added.  Moreover,  it  is  believed  that  because 
participants  share  certain  characteristics,  they 
will  be  inclined  to  express  their  opinions  hon- 
estly during  the  discussion. 


Participant-Observation 

Participant-observation  differs  from  standard 
observation  in  that  the  evaluator  participates  in 
the  activity  being  observed.  The  evaluator  usu- 
ally takes  notes  on  the  progress  of  the  activity, 
the  physical  setting,  patterns  of  interaction  and 
decisionmaking,  and  responses  to  planned  and 
unplanned occurrences. As a participant, the
evaluator  also  records  his  or  her  own  reactions 
to  the  nature  of  the  activity,  noting,  for  example, 
if  the  activity  was  stimulating  or  tedious.  In 
programs  serving  marginalized  groups  (e.g., 
recovering  substance  abusers,  the  disabled,  the 
elderly),  observers  in  general  may  not  be  wel- 
come, but  participant-observers  who  resemble 
group  members  to  some  degree  may  be  wel- 
come. 

Archival  Research 

Archival  research  with  a  qualitative  focus 
involves  examining  written  records  in  order  to 
understand  a  program  better.  Meeting  minutes, 
journals,  logs,  program  and  agency  correspon- 
dence, and  other  historical  documents  inform 
evaluators about program operations. These
materials  also  generate  ideas  for  other  questions 
to  pursue  in  interviews  and  observations. 

Quantitative Methods

By  definition,  quantitative  methods  produce 
measurable  findings  that  are  expressed  in  num- 
bers, such  as  amounts,  ratios,  and  percentages. 
Frequently  employed  quantitative  data  collec- 
tion methods  include  a  variety  of  survey  ques- 
tionnaires, observation,  and  archival  research. 
These  methods  are  described  below. 

Questionnaires 

Questionnaires  are  lists  of  questions  that  ask 
about a range of behaviors and/or opinions.
Questionnaires  generally  restrict  answers  to 
those  provided  on  the  form.  That  is,  little  or  no 
space  is  provided  for  respondents  to  answer 
questions  in  a  way  that  diverges  from  the 
response  range  conceived  by  the  question- 
naire's creator.  Questionnaires  can  be  designed 
for  a  specific  program,  intervention,  or  popula- 
tion. In  some  cases,  standard  questionnaires 
used  in  other  research  and  evaluation  efforts 
may  be  more  appropriate. 

For easy statistical analysis, each possible
response  is  assigned  a  numerical  value.  In  turn, 
this  value  is  used  to  code  actual  responses;  thus, 
questionnaire  data  can  be  processed  and  ana- 
lyzed quickly. 
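
The coding step described above can be sketched in a few lines of Python. This is only an illustration: the five-point agreement scale, the names CODEBOOK and code_response, and the sample answers are assumptions made for the example and do not come from any particular instrument.

```python
# Hypothetical codebook: every allowed answer gets a numeric code.
CODEBOOK = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}
MISSING = None  # used for blank or unusable answers


def code_response(raw_answer):
    """Return the numeric code for one answer, or MISSING if it cannot be coded."""
    if raw_answer is None:
        return MISSING
    return CODEBOOK.get(raw_answer.strip().lower(), MISSING)


# Illustrative answers to one question from five returned forms.
answers = ["Agree", "strongly agree", "", "Neutral", "agree / disagree"]
coded = [code_response(a) for a in answers]
print(coded)  # [4, 5, None, 3, None]
```

Blank answers and answers that fall outside the codebook are kept as missing values rather than guessed at, so they can be counted separately during analysis.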

Ideally,  questionnaire  surveys  should  be 
brief,  while  still  collecting  the  necessary  data. 
A  new  instrument  should  be  pilot-tested — that 
is,  tested  for  clarity,  readability,  and  suitability 
with  a  smaller  but  similar  population — before 
it  is  administered  in  an  evaluation.  Based  on 
feedback  from  the  pilot  test,  revisions  to  the 
instrument  may  be  necessary.  Good  question- 
naires are  worded  in  an  accessible  language 
and  at  an  appropriate  reading  level.  They  are 
also  designed  and  implemented  in  a  culturally 
sensitive  fashion.  Questionnaires  are  usually 
administered  in  one  of  three  ways. 

Self-Administered  Questionnaires 

Self-administered  (or  written)  questionnaires 
require  that  the  respondent  complete  all  por- 
tions of  the  form  without  assistance.  Self- 
administered  questionnaires  can  be  distributed 
directly  by  the  evaluator  to  the  respondent  or 
indirectly  by  mail  or  other  third  party  (such  as  a 
classroom  teacher  or  an  employer).  When  ques- 
tionnaires are  distributed  directly,  more  com- 
pleted questionnaires  are  likely  to  be  returned. 

One  drawback  of  self-administered  ques- 
tionnaires is  that  not  all  respondents  may 
understand  or  interpret  the  questions  and  give 
responses  in  the  way  that  the  evaluator  intend- 
ed. Consequently,  the  questionnaire  may  pro- 
duce data  that  are  not  completely  accurate  or,  in 
some  cases,  are  contradictory.  Another  problem 
results  when  respondents  do  not  answer  all 
questions.  In  other  cases  respondents  may  not 
believe  that  any  of  the  given  responses  accu- 
rately or  even  remotely  express  their  particular 
viewpoint.  In  these  cases,  they  may  choose  more 
than  one  response  or  choose  not  to  respond  at 
all,  thereby  invalidating  their  response  to  the 
question  and  contributing  to  different  response 
rates  for  each  question  on  the  form.  Finally,  self- 
administered  questionnaires  are  subject  to  selec- 
tion bias  that  can  affect  findings.  For  example, 
those  who  liked  the  program  may  be  more  like- 
ly to  return  questionnaires  than  those  who  did 
not.  Mail  surveys  are  well  known  for  producing 
relatively  low  response  rates. 
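
One way to see the uneven response rates mentioned above is to compute, for each question, the share of returned forms that carry a usable answer. The sketch below assumes the forms have already been coded, with None marking a skipped or invalidated answer; the question names and data are invented for illustration.

```python
# Illustrative coded forms: one dict per respondent, None marks an unusable answer.
forms = [
    {"q1": 4, "q2": 2, "q3": None},
    {"q1": 5, "q2": None, "q3": None},
    {"q1": 3, "q2": 1, "q3": 2},
    {"q1": 4, "q2": 2, "q3": 5},
]


def item_response_rates(forms):
    """Return, for each question, the share of forms with a usable answer."""
    questions = sorted({q for form in forms for q in form})
    rates = {}
    for q in questions:
        answered = sum(1 for form in forms if form.get(q) is not None)
        rates[q] = answered / len(forms)
    return rates


rates = item_response_rates(forms)
for question, rate in rates.items():
    print(f"{question}: {rate:.0%}")
```

A question with a markedly lower rate than the others is a candidate for rewording in the next pilot test.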


Face-to-Face  Questionnaires  (Structured  Interviews) 

Face-to-face  questionnaires,  or  structured  inter- 
views, require  that  an  interviewer  ask  all  the 
questions  on  the  form  directly  to  the  respondent 
and  record  responses  on  the  questionnaire  form. 
Under  these  circumstances,  the  respondent  can 
often  ask  for  clarification  if  questions  are  con- 
fusing. Some  believe  that  the  one-on-one  nature 
of  structured  interviews  increases  the  likelihood 
that  all  responses  will  be  completed  because  the 
interviewer  is  not  likely  to  let  a  question  remain 
unanswered. Structured interviews also allow
for easier comparisons among respondents than
semistructured interviews (discussed above).
On  the  other  hand,  respondents  may  be  less 
likely  to  answer  sensitive  questions  accurately 
in  a  face-to-face  setting.  Structured  interviews 
are  also  more  costly  to  implement  because  they 
require  the  use  and  training  of  interviewers. 

Telephone  Questionnaires 

Telephone  questionnaires  are  also  surveys  that 
require  the  use  of  an  interviewer,  and,  like  face- 
to-face  questionnaires,  the  involvement  of  this 
person  allows  for  clarification  of  questions  that 
seem  confusing  to  the  respondent.  Moreover, 
telephone  questionnaires  yield  nearly  as  high  a 
response  rate  as  structured  interviews  but  at  a 
much  lower  cost,  especially  if  the  respondents 
are  spread  over  a  large  geographic  area.  Confi- 
dentiality may  be  more  difficult  to  maintain  in 
telephone  interviews,  however,  because  the 
interviewer  cannot  completely  control  for 
privacy. 

Observation 

Observation  is  another  method  employed  by 
evaluators  who  wish  to  collect  data  in  a  natural 
setting.  Observers  should  be  as  unobtrusive  as 
possible  as  they  record  behavior  using  checklist 
instruments  that  count  the  frequency  of  certain 
behaviors  or  interactions.  Checklists  are  rela- 
tively easy  to  complete.  The  observer  looks  for 
certain  predetermined  behaviors  and  checks 
them  off  when  they  occur.  For  example,  the 
observer  may  watch  to  see  if  salesclerks  ask 
for  age  identification  before  selling  tobacco  or 
alcohol  to  young  customers. 
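
A checklist of this kind reduces to a simple tally. The following Python sketch is hypothetical: the checklist items and the six recorded transactions are made up to mirror the salesclerk example.

```python
from collections import Counter

# Hypothetical checklist items for observing tobacco or alcohol sales.
CHECKLIST = ("asked_for_id", "sold_without_id", "refused_sale")

# Each entry is one observed transaction, recorded as the item checked off.
observations = [
    "asked_for_id", "sold_without_id", "asked_for_id",
    "refused_sale", "asked_for_id", "sold_without_id",
]

tally = Counter(observations)
total = len(observations)
for item in CHECKLIST:
    count = tally[item]
    print(f"{item}: {count} of {total} ({count / total:.0%})")
```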

Interactional  instruments  measure  the 
dynamics  of  human  interaction.  They  note 
behaviors  at  certain  times  and  record  patterns 
of interaction, such as who talked to whom. An
interactional  instrument  may  be  suitable  to 
measure  certain  prevention  objectives.  For 
example,  if  the  intent  of  a  program  is  to  increase 
friendship  attachment  cliques  in  a  middle 
school,  observers  could  record  the  patterns  of 
interactions  among  students  in  a  classroom. 
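
Records of who talked to whom can be summarized by counting contacts per pair of students. The sketch below is an assumption-laden illustration: the names and the log are invented, and it treats a contact as the same pair regardless of who spoke first, which may or may not suit a given instrument.

```python
from collections import Counter

# Illustrative observation log: (speaker, listener) pairs noted at fixed intervals.
log = [
    ("Ana", "Ben"), ("Ben", "Ana"), ("Ana", "Cy"),
    ("Ana", "Ben"), ("Dee", "Cy"), ("Ben", "Ana"),
]

# Count contacts per pair, ignoring who started the exchange.
pair_counts = Counter(frozenset(pair) for pair in log)

for pair, count in pair_counts.most_common():
    print(sorted(pair), count)
```

Comparing such counts before and after an intervention gives a rough picture of whether interaction patterns have shifted.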

Regardless  of  the  data  source,  data  collec- 
tors must  be  selected  carefully  and  trained  to 
use  consistent  techniques  that  do  not  introduce 
bias  into  the  data.  Bias  can  occur  when  data  are 
not  collected  systematically  from  all  the  target- 
ed groups  or  individuals,  when  differences 
among  observers  produce  differences  in  data 
collected,  or  when  data  collection  itself  intro- 
duces systematic  influences  on  what  is  being 
observed. 

Archival  Research 

Archival  research  is  one  of  the  most  frequently 
used  and  least  expensive  data  collection  meth- 
ods. Information  that  is  normally  kept  as  part  of 
agency  or  organization  operations  (e.g.,  school 
records,  health  care  and  social  services  records, 
police  and  juvenile  court  records)  can  provide 
important  statistical  data  for  evaluations  and 
can  supplement  the  information  gathered  from 
other  data  sources. 

In  addition,  major  data  sets  indicating 
trends  in  specific  substance  abuse-related  prob- 
lems may  provide  information  relevant  to  the 
evaluation  or  useful  in  comparisons  of  local  sta- 
tistics with  county,  State,  and  National  statistics. 
State  and  local  alcohol  beverage  control  agen- 
cies systematically  collect  data  on  active  liquor 
licenses.  Through  its  Alcohol  Epidemiologic 
Data  System,  the  National  Institute  on  Alcohol 
Abuse  and  Alcoholism  maintains  data  on  alco- 
hol consumption  in  the  50  States  and  the  Dis- 
trict of  Columbia.  State  health  departments 
maintain  records  on  acute  hospital  discharges 
and  deaths  attributable  to  alcohol-,  drug-,  and 
tobacco-related  illnesses,  as  well  as  records  on 
substance  abuse-related  health  problems  in 
newborns.  State  departments  of  social  services 
often  contain  records  on  substance  abuse-relat- 
ed child  abuse  and  neglect.  The  National  High- 
way Traffic  Safety  Administration's  Fatal 
Accident Reporting System tracks alcohol-related
traffic fatalities. SAMHSA's Uniform Facility
Data Set (UFDS) (formerly NDATUS) tracks
treatment facility admissions and financing.

Two  limitations  on  the  use  of  agency 
archival  data  for  evaluation  purposes  are  confi- 
dentiality (written,  informed  consent  of  partici- 
pants is  sometimes  necessary)  and  limited 
relevance  (needed  data  are  not  included  or  are 
incomplete  or  inconsistently  collected). 

Table  6  summarizes  the  data  collection 
methods  discussed  in  this  chapter,  noting  the 
strengths  and  weaknesses  of  each. 

Summary 

This  chapter  described  commonly  used  quanti- 
tative and  qualitative  data  collection  methods, 
noting  the  appropriate  uses  and  strengths  and 
weaknesses  of  each  method.  Both  quantitative 
and  qualitative  research  methods  should  be 
employed  to  assess  program  functioning  as  well 
as  program  outcomes.  Although  collecting  a 
wide  array  of  evaluative  information  may  at 
first  appear  daunting,  measurement  develop- 
ment and  data  collection  do  not  have  to  be  the 
work  of  just  one  person,  group,  or  agency. 
Rather,  the  evaluation  can  be  a  coordinated 
effort  of  individuals  and  organizations. 

Data  can  come  from  different  sources. 
Some  information  may  be  available  from 
record  archives.  For  example,  law  enforcement 
agencies  maintain  records  of  traffic  crashes, 
and  hospitals  maintain  records  of  the  occur- 
rence of  various  injuries  and  diseases  related 
to  substance  use.  Other  information  may  be 
collected  through  measurement  instruments 
that  are  already  routinely  administered  (e.g., 
school  surveys  administered  by  departments 
of  education). 

Data  collection  methods  can  also  comple- 
ment each  other.  For  example,  observation  of  a 
classroom  can  indicate  how  well  children 
appear  to  respond  to  the  program,  and  review 
of  their  school  records  can  help  determine  the 
impact  of  the  program  on  academic  perfor- 
mance. Good  evaluations  generally  employ  sev- 
eral types  of  data  collection  methods  so  that  a 
more  complete  and  accurate  overview  of  inter- 
vention operations  and  effectiveness  can  be  doc- 
umented and  defended  by  various  data  sources. 
Table 6. Common Data Collection Methods

Qualitative Methods

Semistructured Interview

Strengths:
• Standardized questions and steering of the interviewer tend to ensure that data are collected in response to the same series of questions from a number of individuals.
• Open-ended responses result in significantly more in-depth data than structured interviews with standardized responses.
• The somewhat conversational nature of this method may promote more candid responses.

Weaknesses:
• More time consuming than structured interviews.
• The range of content responses requires considerably more time for analysis than for structured interviews.
• Poor interviewing or interpersonal skills of the evaluator may result in the introduction of bias (e.g., leading respondents).

Unstructured Interview

Strengths:
• Open-ended responses result in tremendously detailed data.
• Often implemented in tandem with participant-observation, and because of the highly conversational nature of this method, interviews result in quite candid responses because the respondent feels very comfortable and familiar with the interviewer/evaluator.

Weaknesses:
• Usually quite time consuming to conduct and to analyze data.
• Appropriate only when seeking critical and detailed information that can only be, or best be, gathered from one source.

Focus Group

Strengths:
• Extended nature of discussion lends itself to group's exploration of specified topics in-depth.
• Because of similar characteristics among group members, participants tend to share opinions and feelings more honestly.

Weaknesses:
• Analysis of discussion content as well as behavior can be time consuming.

Participant-Observation

Strengths:
• Yields considerably more detailed information about program activity than conventional observation.
• Because the participant-observer looks like other participants, his/her presence may have less effect on the normal behavior of other participants than would the presence of an outside observer.

Weaknesses:
• Transcription and analysis of field notes is time consuming.
• Among marginalized groups, participant-observers who do not share many characteristics of the study population may encounter difficulty being accepted.

Archival Research

Strengths:
• Provides historical information about the study subject.
• One of the least costly evaluation methods.
• Information may generate ideas for better questions to pursue through interviews and to explore through observation/participant-observation.

Weaknesses:
• Access to some types of institutionalized records may be difficult to obtain.
• Data quality may be variable.

Quantitative Methods

Self-Administered Questionnaire

Strengths:
• Collects a great deal of information from a large number of individuals in a standardized, and therefore easy-to-analyze, way.
• May be implemented in a relatively efficient and inexpensive way.

Weaknesses:
• If not worded well or at an appropriate reading level, will cause confusion in the respondent and possibly lead to erroneous data.
• Respondents may not answer all questions.
• Frustrated by limited response choices, respondents may provide multiple answers to questions and nullify their responses.
• Questionnaires distributed by mail are subject to selection bias and generally result in lower response rates than for other types of questionnaires.

Face-to-Face Questionnaire (Structured Interview)

Strengths:
• Collects a great deal of information from a large number of individuals in a standardized, and therefore easy-to-analyze, way.
• Presence of interviewer tends to ensure that all questions are answered in the desired format.
• Presence of interviewer allows for clarification of question meaning.
• Data may be analyzed relatively inexpensively.

Weaknesses:
• Respondents may be less inclined to answer sensitive questions truthfully in a face-to-face format.
• More costly and more time consuming than self-administered questionnaires.
• Greater risks of loss of confidentiality.

Telephone Questionnaire

Strengths:
• Collects a great deal of information from a large number of individuals in a standardized, and therefore easy-to-analyze, way.
• Involvement of interviewer tends to ensure that all questions are answered in the desired format.
• Physical distance between interviewer and respondent may increase likelihood for honest responses to sensitive questions.
• Data may be analyzed relatively inexpensively.
• May be more cost efficient than other survey strategies if a large geographic area is targeted.

Weaknesses:
• More time consuming than self-administered questionnaires.
• Significantly more costly than mail surveys.
• Confidentiality may be more difficult to maintain because the interviewer cannot control for privacy on the respondent's end of the telephone line.

Observation

Strengths:
• Once access to setting has been secured, data may be collected easily and unobtrusively.
• Counting and use of checklists make data analysis relatively easy and inexpensive.

Weaknesses:
• Knowledge of observer's presence may alter normal actions and behaviors of the individuals being observed.
• Observer's presence may not be welcome by activity's participants.

Archival Research

Strengths:
• Provides systematically collected historical information about the study subject.
• One of the least costly evaluation methods.
• May generate other questions for new ideas to pursue about the study topic.

Weaknesses:
• Access to some types of institutional records may be difficult to obtain.
• Records may not contain relevant data, or data may be incomplete or inconsistent.


Chapter 6


Concepts  in  Data  Analysis 


Data  analysis  is  a  process  that  involves  cat- 
egorizing, ordering,  manipulating,  and 
summarizing  data.  Throughout  this  process, 
evaluators  draw  preliminary  conclusions  from 
the  data  and  seek  verification  of  these  conclu- 
sions from  various  sources  within  the  database. 
This  chapter  discusses  basic  concepts  in  quanti- 
tative and  qualitative  data  analyses  that  fre- 
quently appear  in  evaluation  reports  and  with 
which  prevention  planners  should  be  familiar.  It 
also  addresses  which  types  of  data  analyses  are 
appropriate  at  each  stage  of  program  develop- 
ment. This  chapter  does  not  detail  specific  data 
analysis  methods  (e.g.,  regression  analysis,  path 
analysis).  Such  information  is  available  in  a 
variety  of  methods  handbooks  and  is  appropri- 
ate reading  for  evaluators  who  must  actually 
carry  out  data  analysis. 

Every  evaluation  plan  should  include  a 
data  analysis  plan  that  specifies  how  the  evalu- 
ator  intends  to  summarize  the  collected  data. 
There  are  two  types  of  quantitative  data  analy- 
sis that  should  be  included  in  any  comprehen- 
sive data  analysis  plan:  descriptive  analysis  and 
relational  analysis. 

Approaches  to  Quantitative  Data: 
Descriptive  Analysis 

Descriptive  analysis  summarizes  information 
in  order  to  provide  an  initial  picture  of  the  pre- 
vention program.  Descriptive  analysis  provides 
basic  information  about  the  variables  included 
in  the  study.  For  example,  descriptive  analysis 
may  try  to  answer  the  questions,  "How  many 
pregnant  women  in  our  prenatal  program 
reported  smoking  tobacco  in  the  year  preceding 
their pregnancy?" or "How many employees
are  affected  by  a  new  smoke-free  workplace 
policy?" 

Counting  and  simple  arithmetic  are  suffi- 
cient for  carrying  out  descriptive  analysis.  The 
evaluator  of  an  individual-oriented  intervention 
can  determine  such  things  as  how  many  people 
were  included  in  the  study;  their  breakdown  by 
age,  sex,  or  any  other  characteristic  considered 
important  for  the  study;  the  types  of  interven- 
tions to  which  they  were  exposed;  how  often 
they  attended  sessions;  and  the  number  who 
reported  specific  behavior  relevant  to  the  study. 
The  evaluator  of  a  policy  intervention  might 
explore  how  many  people  were  exposed  to  the 
intervention,  how  long  the  intervention  lasted, 
and  how  consistently  the  intervention  was 
applied.  Once  this  information  is  gathered,  it 
can  be  presented  either  as  raw  numbers  or  as 
percentages  of  a  larger  number  or  the  larger 
study  subject  pool. 
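
The counting and percentage calculations described above might look like the following in Python. The participant records, field names, and the breakdown helper are all invented for illustration; a real evaluation would read these from its own data files.

```python
from collections import Counter

# Illustrative participant records; the fields and values are made up.
participants = [
    {"sex": "F", "age_group": "18-25", "sessions": 6},
    {"sex": "M", "age_group": "18-25", "sessions": 4},
    {"sex": "F", "age_group": "26-40", "sessions": 8},
    {"sex": "F", "age_group": "26-40", "sessions": 2},
    {"sex": "M", "age_group": "41+", "sessions": 8},
]


def breakdown(records, field):
    """Return {value: (count, percent of all records)} for one characteristic."""
    counts = Counter(r[field] for r in records)
    n = len(records)
    return {value: (c, round(100 * c / n, 1)) for value, c in counts.items()}


print("N =", len(participants))
print("By sex:", breakdown(participants, "sex"))
print("By age group:", breakdown(participants, "age_group"))
```

Reporting both the raw count and the percentage, as the helper does, lets readers judge how many people stand behind each figure.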

Measures  of  Central  Tendency 

After  basic  descriptive  statistics  are  compiled, 
evaluators  may  wish  to  determine  central  ten- 
dency. In  evaluation  research,  central  tendency 
is  often  presented  in  terms  of  the  mean,  median, 
and  mode.  A  mean  is  simply  an  average,  and  it 
can  be  calculated  by  dividing  the  sum  of  values 
by  the  number  of  values.  A  median  is  the  point 
at  which  50  percent  of  the  values  fall  below  it 
and  50  percent  exceed  it.  The  median  is  often 
presented  along  with  the  mean  because  the 
mean  is  influenced  by  the  high  and  low 
extremes  of  a  range.  The  mode  refers  to  the 
value  most  often  given  by  respondents.  Table  7 
illustrates  the  mean,  median,  and  mode  when 
13  participants  in  a  prevention  workshop  were 
asked  to  rate  the  overall  course  on  a  scale  of 
1  to  10. 
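
As a rough check, the three measures can be computed directly from the 13 ratings listed in Table 7. The sketch below uses Python's standard statistics module; the variable names are illustrative only.

```python
import statistics

# The 13 workshop ratings listed in Table 7.
ratings = [8, 9, 5, 8, 8, 10, 7, 8, 8, 7, 8, 7, 8]

mean = statistics.mean(ratings)      # sum of values / number of values
median = statistics.median(ratings)  # middle value once the ratings are sorted
mode = statistics.mode(ratings)      # most frequently given rating

print(f"sum = {sum(ratings)}, n = {len(ratings)}")
print(f"mean = {mean:.1f}, median = {median}, mode = {mode}")
```

The single low rating of 5 pulls the mean slightly below the median, which is the kind of influence from extremes that the text describes.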

Standard  Deviation 

It is often useful to know how much variation
exists among the scores. For example, suppose
the evaluator of the prevention workshop wanted
to know whether participants rated the
workshop fairly consistently or whether there
was a great range of opinion. In order to determine
the degree to which responses were consistent,
it is necessary to calculate the standard
deviation.

Table 7. Sample Rating Scores on Scale
of 1 (Extremely Poor) to 10 (Excellent)

Scores: 8, 9, 5, 8, 8, 10, 7, 8, 8, 7, 8, 7, 8

Mean = 7.8 (101/13)
Median = 8
Mode = 8

Standard  deviation  refers  to  the  spread  of 
scores  away  from  the  mean  score,  noting  where 
the  bulk  of  responses  lie.  Applying  the  formula 
for  computing  standard  deviation  to  the  data 
from  the  prevention  workshop  example,  we 
learn  that  the  standard  deviation  value  is  1.17. 
Given that certain statistical assumptions hold
true (chiefly, that the scores are approximately
normally distributed), this means that about
68 percent of the rating scores are expected to
fall within one standard deviation above
and below the mean, or between 6.63 and 8.97.
Since we know that the 10-point rating scale
used to evaluate the workshop used increments
of one, we can also say that about 68 percent of
the respondents would be expected to give the
workshop a rating of either 7 or 8. In light of the
mean (7.8), one can conclude that the data
contained a very low standard deviation and that
participants rated the workshop fairly consistently.
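
The standard deviation for the workshop ratings can be reproduced with a short Python sketch. It assumes the sample formula (dividing by n - 1), which yields the 1.17 reported here; the population formula would give about 1.12. Because the code works from unrounded values, the band it prints differs slightly from the rounded figures in the text.

```python
import statistics

# The 13 workshop ratings from the prevention workshop example.
ratings = [8, 9, 5, 8, 8, 10, 7, 8, 8, 7, 8, 7, 8]

mean = statistics.mean(ratings)
sd = statistics.stdev(ratings)  # sample standard deviation (divides by n - 1)

# Band of one standard deviation on either side of the mean.
low, high = mean - sd, mean + sd
within = [r for r in ratings if low <= r <= high]
share = len(within) / len(ratings)

print(f"standard deviation = {sd:.2f}")
print(f"one SD around the mean: {low:.2f} to {high:.2f}")
print(f"ratings inside that band: {len(within)} of {len(ratings)} ({share:.0%})")
```

In this small sample, 10 of the 13 ratings (about 77 percent) are 7s or 8s. The 68 percent figure is what a normal distribution would predict, and a short list of whole-number ratings only approximates it.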

Consistency  of  ratings  (a  low  standard 
deviation)  is  often  a  welcome  sight  to  evalua- 
tors, regardless  of  whether  ratings  are  consis- 
tently good  or  bad.  A  low  standard  deviation 
suggests  that  later  conclusions  based  on  com- 
plex statistical  (relational)  analysis  are  more 
easily  defendable. 

Stages of Development and
Descriptive  Analysis 

The  simple  yet  informative  nature  of  descrip- 
tive analysis  suggests  that  it  is  best  used  as  a 
backdrop  to  relational  analysis  in  all  experi- 
mental and  many  quasi-experimental  research 
efforts.  It  is  also  quite  well  suited  to  quasi- 
experimental  research  designs  to  provide  gen- 
eral descriptive  information  about  a  program 
itself  and  not  its  effects.  Moreover,  in  light  of 
the  multifaceted  nature  of  the  most  complete 
evaluations,  descriptive  analysis  may  be  the 
primary  type  of  analysis  on  data  collected 
during  the  early  stages  of  an  intervention. 

During  the  Initiation  and  Planning  Stages, 
staff are likely to be concerned with determining
the nature and extent of problems and
assessing  the  community's  level  of  interest  in 
perceived  problems  as  well  as  any  effort  to 
address  them.  Prevention  planners  may  also 
wish  to  know  the  perceived  causes  of  the  prob- 
lem as  well  as  the  perceived  obstacles  to  over- 
coming it.  This  type  of  information  can  be 
gathered through quantitative and/or qualitative
research instruments. If surveys are
employed,  descriptive  analysis  is  probably 
the  only  type  of  quantitative  analysis  that  one 
might  expect  to  perform  on  the  data.  This 
analysis  would  provide  information  about 
which  segments  of  the  community  are  con- 
cerned about  the  problem  (e.g.,  men,  women, 
parents,  business  people,  the  elderly)  and  about 
how  concern  varies  among  different  constituen- 
cies. It  could  also  provide  information  about 
how  particular  constituencies  interpret  causes 
and  solutions.  These  causes  and  solutions  could 
be  rank  ordered  according  to  the  frequency 
with  which  they  were  cited. 
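
Rank ordering by frequency of citation is a small counting exercise. In the sketch below the cited causes are invented examples, not survey findings.

```python
from collections import Counter

# Illustrative survey answers: the main cause each respondent cited.
cited_causes = [
    "peer pressure", "easy access", "peer pressure", "boredom",
    "easy access", "peer pressure", "lack of supervision",
]

# most_common() sorts causes from most to least frequently cited.
ranking = Counter(cited_causes).most_common()
for rank, (cause, count) in enumerate(ranking, start=1):
    print(rank, cause, count)
```

The same tally can be run separately for each constituency to compare how different groups interpret the problem.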

During  a  Pilot-Testing  Stage,  evaluators 
may  wish  to  note  participants'  level  of  satisfac- 
tion with  the  intervention,  as  well  as  to  docu- 
ment its  actual  implementation.  Once  again, 
quantitative  or  qualitative  data  collection  tech- 
niques can  be  used.  If  the  evaluation  plan  calls 
for  a  survey  of  participants  or  observation  with 
the  use  of  checklists  and  other  instruments  to 
record  interaction,  descriptive  analysis  will  be 
necessary  to  determine  how  participants  evalu- 
ated the  program  (e.g.,  how  many  participants 
rated the intervention unsatisfactory). Descriptive
analysis will also indicate whether planned
occurrences  and  desired  behaviors  actually 
occurred  and  how  often  they  occurred  within 
the  context  of  the  intervention. 

In  the  Implementation  and  Stabilization 
Stages,  tasks  and  research  questions  tend  to  be 
somewhat  similar  in  that  questions  regarding 
adequacy  of  the  intervention,  obstacles,  costs, 
and  effects  are  ongoing  concerns.  These  are  also 
stages  in  which  outcome-related  questions 
become  more  urgent.  Consequently,  descriptive 
analysis  becomes  essential  for  providing  a 
framework  for  understanding  the  results  of 
more  complex  statistical  analyses.  In  addition, 
descriptive  analysis  remains  necessary  to 
address  such  concerns  as  participant  satisfac- 
tion and  other  support  for  the  intervention. 

The  Dissemination  Stage  involves,  in  part, 
assessing  projected  needs.  It  may  also  involve 
expanding  the  intervention  to  either  implement 
new  interventions  or  institute  the  same  services 
in a new area. If the intervention expands, evaluators
will return to some of the questions
addressed  in  the  Initiation  and  Planning  Stages 
and  employ  descriptive  analysis  accordingly. 

Approaches  to  Quantitative  Data: 
Relational  Analysis 

The  second  type  of  quantitative  analysis  that 
one  would  expect  to  find  in  a  data  analysis  plan 
is  relational  analysis.  Relational  analysis  allows 
one  to  understand  the  relationship  between  key 
variables  in  the  study  or  to  test  a  hypothesis.  It 
is  particularly  well  suited  to  outcome  evalua- 
tions. For  example,  relational  analysis  could 
determine  whether  requiring  a  beverage  server 
training  course  was  related  to  fewer  alcohol- 
related  traffic  accidents  in  a  given  community. 
Relational  analysis  involves  making  compar- 
isons between  people  who  were  subjected  to  the 
intervention  and  those  who  were  not;  in  other 
words,  analysis  must  be  based  on  data  from  an 
experimental  design  or  a  quasi-experimental 
design  that  controlled  for  spurious  effects  of 
variables  other  than  the  intervention. 

Statistical  Significance 

The results of relational analysis are presented
in the form of statistics, highlighting what are
termed statistically significant results. Statistical
significance is another way of saying that a particular
relationship between variables occurs so
frequently in the data that the relationship's
existence can probably not be attributed to
chance. Statistical significance is determined
through one of several tests. Generally speaking,
statistical results are said to be significant
when relationships between variables reach
the .05 (or in some cases .01) level; that is,
when a relationship at least as strong as the one
observed would have a less than 5-percent
probability of happening by chance alone.

Table 8. Contingency Table (Example)

                    Employees    Employees
                    Attending    Attending
                    Program A    Program B
                    N = 160      N = 190      TOTALS

Employees who       60           25           85
requested           (37.5%)      (13.2%)      (24.3%)
information

Employees who       100          165          265
did not request     (62.5%)      (86.8%)      (75.7%)
information

TOTALS              160          190          350
                    (100%)       (100%)       (100%)

When  considering  statistically  significant 
findings,  prevention  planners  should  be  aware 
of  certain  limitations  of  the  term.  It  simply 
points  out  that  a  statistically  significant  rela- 
tionship exists.  It  does  not  mean  that  an  inter- 
vention caused  the  relationship.  Such  a 
determination  depends  on  the  research  design 
and  more  complex  statistical  analyses.  In  addi- 
tion, statistical  significance  does  not  indicate 
that  the  finding  is  strong  enough  to  be  impor- 
tant to  program  planners  and  policymakers. 
This  determination  should  be  based  on  an 
analysis  of  the  costs  and  benefits  of  the  inter- 
vention, among  other  issues. 
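
The gap between statistical significance and practical importance can be seen in a small worked example. The sketch below is only an illustration: the counts are hypothetical, and the two-proportion z-test is one common choice of test, not one prescribed by this guide. With very large groups, a difference of less than one percentage point still reaches the .05 level.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (difference in proportions, z statistic, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the assumption that the groups do not differ.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value

# Hypothetical counts: 50.8 percent versus 50.0 percent in two very large groups.
diff, z, p_value = two_proportion_z_test(50_800, 100_000, 50_000, 100_000)
print(f"difference = {diff:.3f}, z = {z:.2f}, p = {p_value:.5f}")
print("significant at .05" if p_value < 0.05 else "not significant at .05")
```

Whether a difference of this size justifies the cost of an intervention is a question the test itself cannot answer.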

Contingency  Tables 

A  relatively  simple  way  of  comparing  relation- 
ships between  or  among  variables  is  through 
the creation of a contingency table. A contingency
table presents the number and percentages
of participants in at least two groups
according  to  at  least  two  variables.  It  is  useful 
when  variables  are  clear-cut  categories,  like  sex 
or  race,  and  not  when  they  can  be  rank  ordered, 
like  age. 

Table 8 is a contingency table for participants
in two different employee assistance programs
who requested followup literature.

This  table  shows  that  people  who  attended 
Program  A  requested  information  more  often 
than  people  who  attended  Program  B.  If  one 
wanted  to  determine  whether  the  difference 
between  the  two  groups  is  statistically  signifi- 
cant, one  would  have  to  employ  a  test  of  statis- 
tical significance. 
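
As one illustration of such a test, the counts in Table 8 can be run through a chi-square test of independence. The sketch below assumes the uncorrected Pearson chi-square statistic, a common choice for a two-by-two table, although the guide does not name a specific test. The function name is hypothetical.

```python
import math

# Observed counts from Table 8. Rows: requested / did not request information.
# Columns: Program A / Program B.
observed = [[60, 25],
            [100, 165]]

def chi_square_2x2(table):
    """Return (chi-square statistic, p-value) for a 2x2 table (1 degree of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Count expected in this cell if program and requests were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability has a closed form.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

chi2, p_value = chi_square_2x2(observed)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2g}")
print("significant at .05" if p_value < 0.05 else "not significant at .05")
```

For these counts the difference between the two programs is well beyond the .05 level, but the result says nothing about why Program A participants requested more information.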

Means  Comparisons 

One  way  of  comparing  relationships  between 
variables  when  they  can  be  rank  ordered  is  by 
comparing  the  means  of  different  groups.  For 
example,  suppose  one  wanted  to  know  if  a 
4-week,  parent-oriented  substance  abuse  pre- 
vention program  affected  how  many  times  par- 
ents discussed  the  topics  of  alcohol,  tobacco, 
and  illicit  drugs  with  their  children  in  the  6- 
month  period  following  completion  of  the  pro- 
gram. One  would  compare  the  mean  number  of 
discussions for a sample of parents who attended
the program with the mean number of discussions
for the control group of parents who
did  not  attend  the  program.  Once  again,  if  a  dif- 
ference was  detected,  it  would  have  to  be  tested 
to  determine  if  it  is  statistically  significant. 
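
A two-group means comparison of this kind can be sketched as follows. The discussion counts are hypothetical, and the exact permutation test used here is only one possible test of significance (a t-test is the more traditional choice); it was picked because it needs nothing beyond the Python standard library.

```python
from itertools import combinations
from statistics import mean

# Hypothetical counts of parent-child discussions in the 6 months after the program.
program = [6, 8, 7, 9, 5, 10, 8, 7]
control = [3, 4, 2, 5, 3, 4, 6, 2]

def permutation_test(group_a, group_b):
    """Exact two-sided permutation test for a difference in means."""
    pooled = group_a + group_b
    total = sum(pooled)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    splits = 0
    # Enumerate every way of relabeling n_a of the pooled values as "program".
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        # Count relabelings whose difference is at least as large as the observed one.
        if diff >= observed - 1e-12:
            extreme += 1
    return observed, extreme / splits

difference, p_value = permutation_test(program, control)
print(f"difference in means = {difference:.3f}, p = {p_value:.4f}")
```

The p-value is the share of all possible relabelings that produce a difference at least as large as the one observed, which is what "happening by chance" means in this setting.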

Means  comparisons  can  also  be  done 
for  three  or  more  groups.  Returning  to  our  ex- 
ample, let  us  say  the  three  groups  were  par- 
ents who  attended  three  or  four  sessions, 
parents  who  attended  one  or  two  sessions, 
and  parents  in  the  control  group.  Means  com- 
parison can  be  done  for  more  than  two  groups 
while  controlling  for  another  measure  (say, 
whether  it  was  a  single-parent  household). 
Both  of  these  approaches  would  require  a 
different  test  of  statistical  significance  for 
any  difference  that  might  be  detected. 
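
For the three-group case, one commonly used approach is a one-way analysis of variance. The sketch below computes only the F statistic, using hypothetical discussion counts; turning F into a p-value requires the F distribution (from a statistical package or a printed table), which is not shown here, and controlling for a further measure would call for a more elaborate model.

```python
from statistics import mean

# Hypothetical discussion counts for the three groups described above.
groups = {
    "three_or_four_sessions": [9, 8, 10, 7, 9, 8],
    "one_or_two_sessions": [6, 5, 7, 6, 4, 6],
    "control": [3, 4, 2, 5, 3, 4],
}

def one_way_anova_f(samples):
    """Return (F statistic, between-groups df, within-groups df)."""
    all_values = [value for sample in samples for value in sample]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of individual values around their own group mean.
    ss_within = sum((value - mean(s)) ** 2 for s in samples for value in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = one_way_anova_f(list(groups.values()))
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F indicates that the group means differ by more than the spread within groups would suggest; it does not say which groups differ from one another.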


Time-Series Analysis

As discussed in chapter 3, a time-series research
design is one way in which the effects of an
intervention can be assessed. The type of statistical
analysis used in time-series designs
involves examining data collected before, during
(if appropriate), and after the intervention.
If, for example, one wanted to know if a temporary
local ordinance banning cigarette vending
machines resulted in decreased cigarette sales,
one might track cigarette sale data for several
months before the enactment of the ordinance,
the period during which the ordinance was in
effect, and several months following the suspension
of the ordinance. A sustained difference
in sales during, and only during, the period
of intervention allows for a plausible assertion
of causality. Statistical techniques are used to
determine whether observed changes over time
are statistically significant.

If the collection of data during the intervention
(a 1-week prevention education course, for
example) seems impractical, data could then be
collected for an extended period of time before,
as well as after, the intervention.

Correlational Approaches

In evaluation research, regressional analyses are
quite common. These types of statistical analyses
are more complex than those previously discussed,
and they focus on the relationships
between variables that can be assigned numeric values.
Correlational or regressional approaches indicate
whether one variable increases when
another variable either increases or decreases;
for example, whether an increase in a student's
exposure to antialcohol messages in school
is accompanied by a decrease in the incidence of
alcohol consumption.

Regressional analysis should be performed
by a trained evaluator. It should also be noted
that the results of regressional analysis can indicate
whether a positive or negative correlation
exists between variables and whether a statistically
significant relationship exists between
variables. The results, in and of themselves, cannot
support a conclusion that an intervention
caused the relationship.
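
The simplest measure of this kind is a correlation coefficient. The sketch below computes the Pearson coefficient for hypothetical data on eight students; the variable names and values are invented for illustration, and a coefficient near -1 or +1 shows only that two measures move together, not that one causes the other.

```python
import math

# Hypothetical data for eight students: hours of exposure to antialcohol
# messages and number of drinking occasions reported in the past month.
exposure_hours = [1, 2, 3, 4, 5, 6, 7, 8]
drinking_occasions = [9, 8, 8, 6, 5, 5, 3, 2]

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How far the two variables move together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

r = pearson_r(exposure_hours, drinking_occasions)
direction = "negative" if r < 0 else "positive"
print(f"r = {r:.2f} ({direction} correlation)")
```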


Advanced  Analyses 

There are other forms of analysis that can be
applied in the evaluation of intervention outcomes
(e.g., analysis of variance, analysis of
covariance). Like correlational or regressional
analysis, they should be performed by an evaluator
trained to do so.

Multiple  Approaches  to  Data  Analysis 

Clearly,  if  one  intends  to  determine  the  effects  of 
an  intervention  on  a  group,  relational  analysis  is 
required.  Specifically,  this  analysis  should  cen- 
ter around  data  collected  at  the  start  of  the 
implementation  stage  through  the  stabilization 
stage  (and  possibly  afterward),  when  evalua- 
tion questions  focus  on  outcomes. 

In  light  of  limited  funding  for  evaluation 
and,  perhaps,  the  limited  availability  of  experi- 
enced program  evaluators,  it  is  reasonable  to 
ask  what  types  of  data  analysis  should  be 
employed.  As  has  been  stated  earlier  in  this 
guide,  multifaceted  research  designs  requiring 
multiple  approaches  to  data  analysis  work 
best.  Every  program  can  carry  out  useful 
descriptive  analyses  of  quantitative  data.  With 
the  guidance  of  an  experienced  researcher, 
means  comparisons  and  contingency  tables 
(two  elementary  forms  of  relational  analysis) 
are  certainly  possible.  More  sophisticated  rela- 
tional analyses  require  considerable  training 
and experience.

Statistical analysis of quantitative data is
well and good, but it uses only part of the evaluator's
collection of useful tools. Regardless of
whether  an  evaluation  plan  and  budget  allow 
for  rudimentary  or  highly  complex  statistical 
analysis,  the  collection  and  analysis  of  qualita- 
tive data  should  be  included  in  any  data  analy- 
sis plan. 

Approaches  to  Qualitative 
Data  Analysis 

Qualitative  data  contain  verbal  descriptions  of 
problems,  behaviors,  or  events  rather  than 
counts  or  rates  of  events  or  occurrences.  Quali- 
tative data  provide  detail,  context,  and  qualify- 
ing considerations  that  relate  to  the  problem 
being  addressed,  and  to  the  process  of  how  the 
problem  was  addressed. 

Qualitative  evaluators  are  among  the  first 
to  assert  that  qualitative  data  are  "very  messy." 
But,  even  though  qualitative  data  in  their  initial 
form  tend  to  be  verbose  and  for  the  most  part 
unstructured,  qualitative  research  remains  one 
of  the  most  valued  components  in  any  evalua- 
tion. Unlike  quantitative  approaches,  which 
begin  with  sharply  defined  categories  of  study 
and  measure  the  degrees  to  which  they  exist, 
qualitative  methods  begin  with  categories  but 
anticipate  that  the  meanings  behind  them  may 
vary  according  to  how  they  are  used,  as  well  as 
who  uses  them.  It  is  the  task  of  the  qualitative 
evaluator  to  discern  patterns  and  differences  in 
meaning  among  the  variety  of  perspectives  and 
to  hypothesize  about  the  causes  of  these  pat- 
terns and  differences. 

For  example,  a  survey  may  reveal  that  the 
vast  majority  of  the  respondents  believe  that 
young  people  use  alcohol  and  illicit  drugs 
because  they  have  no  attractive  alternative 
choices.  Through  qualitative  research,  one  can 
begin  to  understand  why  people  in  general  or 
segments  of  the  population  (e.g.,  whites,  blacks, 
others)  believe  this  to  be  true.  If  there  is  varia- 
tion among  responses,  the  qualitative  evaluator 
will  attempt  to  find  out  its  causes.  Patterns  dis- 
cerned from  interview  data,  for  example, 
should  then  be  compared  with  data  derived 
from  other  sources,  such  as  participant-observa- 
tion and  focus  groups.  Ultimately,  the  analysis 
may reveal contextual issues that could significantly
affect the development of any intervention
plan as well as explain the success or failure
of intervention effects.

The  remainder  of  this  chapter  briefly 
describes  the  basic  progression  of  qualitative 
data  analysis  in  most  evaluation  studies:  data 
transcription,  data  reduction,  and  conclusion 
drawing. 

Data  Transcription 

The  rawest  forms  of  data  are  field  notes  gener- 
ated through  participant-observation.  These 
notes contain the evaluator's immediate and
initial  thoughts  about  occurrences  and  should 
be  elaborated  upon  further  in  writing.  For 
example,  if  a  program  evaluator  attends  a  series 
of  meetings  in  which  community  members 
draft  an  ordinance  to  restrict  alcohol  use  in  pub- 
lic settings,  field  notes  should  be  transcribed 
after  each  meeting.  The  more  time  that  elapses 
between  taking  the  field  notes  and  writing  them 
up,  the  greater  the  likelihood  that  the  evaluator 
will  not  recall  the  full  meaning  behind  the 
notes.  Qualitative  data  from  interviews,  focus 
groups,  and  archival  research  should  be  treated 
in  a  similar  manner. 

Data  Reduction 

Once  all  recorded  data  are  transcribed,  they  and 
other  data  are  organized  into  more  user-friend- 
ly formats  through  the  use  of  codes.  Qualitative 
researchers  often  begin  data  collection  with 
general  code  categories  based  on  research  ques- 
tions. After  initial  review  of  collected  data 
(through  data  transcription),  these  general 
codes  are  refined  in  order  to  facilitate  the  dis- 
cernment of  patterns  and  inconsistencies.  These 
revised  codes  are  then  assigned  to  "chunks"  of 
transcribed  data.  This  is  a  critical  first  step  in 
qualitative  data  analysis  because  it  involves 
simplifying  what  appears  initially  in  a  very 
unstructured  or  "messy"  form. 

For  example,  suppose  one  evaluation  ques- 
tion concerned  the  perceived  causes  of  sub- 
stance abuse-related  problems.  A  general  code 
category  for  collected  data  might  be  called 
"causes  of  problems."  After  initial  review  of 
data, this general code might be refined to contain
several subcode categories, such as "availability
of substances" and "unemployment."
These  codes  and  subcodes  are  then  assigned  to 
data.  Sometimes  it  may  be  relatively  easy  to 
assign  codes,  as  when  respondents  in  semistruc- 
tured  interviews  are  asked  to  comment  on  caus- 
es. Other  times,  pertinent  information  may  be 
embedded  in  other  types  of  commentary,  as 
when  focus  group  members  discuss  solutions  to 
substance-related  crime  (e.g.,  visible  police  pres- 
ence). In  such  instances,  data  "chunks"  may  be 
double  coded.  Multiple  coding  of  qualitative 
data  is  fairly  common. 

Data  Display 

Computers  have  greatly  facilitated  the  process 
of  assigning  codes  to  bits  of  data.  If  possible,  a 
body  of  data  that  has  been  transcribed  with 
common  word-processing  software  can  be  easi- 
ly adapted  into  a  "code  and  retrieve"  software 
program,  such  as  The  Ethnograph.  This  type  of 
program  is  also  capable  of  displaying  all  data 
chunks  that  are  associated  with  a  particular 
code  or  subcode.  Once  the  evaluator  is  able  to 
view  qualitative  data  in  this  manner  (either 
with  or  without  the  assistance  of  computer  soft- 
ware), the  final  process  in  qualitative  data 
analysis  (conclusion  drawing)  becomes  possible. 
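
The "code and retrieve" idea can be sketched in a few lines. The example below is an illustration only and does not reproduce how The Ethnograph or any other package works: the chunk texts are invented, and the "code/subcode" naming convention and the function name are assumptions made for the sketch. It shows codes and subcodes assigned to chunks, one chunk that is double coded, and retrieval of every chunk filed under a code.

```python
from collections import defaultdict

# Hypothetical transcribed "chunks", each tagged with one or more codes.
# Codes are written as "code/subcode", following the example in the text.
chunks = [
    {"source": "interview 1",
     "text": "Anyone can buy beer at the corner store.",
     "codes": ["causes of problems/availability of substances"]},
    {"source": "interview 2",
     "text": "There are no jobs here, so people drink.",
     "codes": ["causes of problems/unemployment"]},
    {"source": "focus group 1",
     "text": "More visible police would help; dealers work openly now.",
     # Double coded: a proposed solution that also points to a cause.
     "codes": ["solutions/visible police presence",
               "causes of problems/availability of substances"]},
]

def build_index(coded_chunks):
    """Map every code, and every general (parent) code, to the chunks that carry it."""
    index = defaultdict(list)
    for chunk in coded_chunks:
        keys = set()
        for code in chunk["codes"]:
            keys.add(code)
            keys.add(code.split("/")[0])  # also file under the general code
        for key in keys:
            index[key].append(chunk)
    return index

index = build_index(chunks)
# Retrieve and display every chunk filed under one general code.
for chunk in index["causes of problems"]:
    print(f'{chunk["source"]}: {chunk["text"]}')
```

Viewing all chunks for a code side by side, across interviews, focus groups, and field notes, is what makes patterns and inconsistencies visible.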

Drawing  Conclusions 

Preliminary  conclusions  are  sometimes  used 
as  a  basis  from  which  the  evaluator  may  seek 
answers  to  a  limited  number  of  additional  ques- 
tions that  are  quite  focused.  For  example,  if  data 
from  several  research  activities  suggest  that  the 
nature  of  law  enforcement  contributes  to  the 
pervasiveness  of  substance  abuse-related  prob- 
lems, then  the  evaluator  may  seek  interview 
data  from  one  or  two  key  individuals  (e.g.,  a 
law  enforcement  official  or  beat  officer)  whose 
knowledge  about  the  topic  can  speak  directly  to 
the  plausibility  of  the  preliminary  conclusion. 
An  evaluator  might  also  conduct  additional 
archival  research  on  such  objective  measures  as 
the  number  of  assigned  officers,  response  times, 
and  basic  crime  statistics  for  the  target  area  as 
compared  with  a  larger  area  (e.g.,  town,  city, 
county).  On  the  basis  of  such  information,  a 
preliminary  conclusion  may  be  asserted  more 
definitively,  or  it  may  simply  be  presented  as  a 
popular perception that is only somewhat substantiated,
or that perhaps is refuted by other
sources.

The  process  of  verifying  preliminary  con- 
clusions may  not  always  require  a  return  to  data 
collection.  Sometimes,  a  program  evaluator 
might  share  preliminary  conclusions  with  pro- 
gram staff  or  original  data  sources  and  request 
feedback.  Or  an  evaluator  may  simply  return  to 
the  original  raw  or  coded  data.  In  any  event, 
verification  of  preliminary  conclusions  is  essen- 
tial in  any  qualitative  data  analysis  effort.  If  it  is 
not  performed  and  the  process  not  documented 
in  the  final  evaluation  report,  conclusions  will 
be open to sharp criticism, and justifiably so.

Summary 

This  chapter  reviewed  concepts  in  quantitative 
and  qualitative  data  analysis  that  should  be 
understood  by  all  prevention  planners. 
Although  many  program  budgets  may  not 
allow for sophisticated quantitative analyses,
every  program  can  still  conduct  useful  forms  of 
descriptive  analysis,  as  well  as  elementary 
forms  of  relational  analysis  on  quantitative  data. 

This  chapter  also  asserted  that  different 
types  of  analysis  are  better  suited  to  certain 
types  of  data  (i.e.,  data  collected  at  specific 
stages  in  a  program's  development).  Descrip- 
tive analysis  of  quantitative  data,  as  well  as 
most  qualitative  analysis,  is  likely  to  be  per- 
formed on  data  collected  during  the  early  phas- 
es of  an  intervention's  existence  (Initiation, 
Planning,  Pilot  Testing).  Relational  analysis  that 
is  oriented  toward  assessing  outcomes  is  better 
suited  to  data  from  later  stages  (Implementa- 
tion, Stabilization,  and  Dissemination). 

Since  most  prevention  interventions  are 
likely  to  be  involved  in  conducting  at  least  some 
degree  of  qualitative  research  during  the  course 
of  an  evaluation,  this  chapter  addressed  the 
basic  progression  of  qualitative  analysis.  If 
participant-observation  or  focus  groups  were 
included  in  the  evaluation  plan,  data  should  be 
transcribed. Data should then be coded and
analyzed  for  patterns  and  inconsistencies.  The 
initial  or  preliminary  conclusions  drawn  from 
qualitative data should be verified in some fashion
by the evaluator in order to produce credible
final conclusions.

Conclusions and Recommendations


Carrying out a credible and useful evaluation
is not easy. Local entities responsible
for  delivering  services  or  implementing  policy 
generally do not employ in-house evaluation
staff.  Spending  scarce  resources  to  purchase 
evaluation  services  is  a  difficult  choice  for  pro- 
gram administrators  who  may  not  realize  just 
how  important  evaluation  is  to  developing 
effective  prevention  interventions  and  ensuring 
the  continuation  of  these  efforts.  But  the  need 
for  demonstrating  the  value  of  prevention  has 
never  been  so  critical  to  the  future  of  preven- 
tion programming  as  it  is  today.  This  guide 
was  designed  to  assist  collaborative  evaluator- 
practitioner  teams  in  designing  and  imple- 
menting useful  evaluations. 

Collaboration  among  evaluator,  prevention 
administrator,  and  intervention  staff  is  essential 
in  any  evaluation  effort.  Such  collaborations  are 
much  more  likely  to  ensure  the  development 
and  implementation  of  appropriate,  compre- 
hensive, and  useful  evaluations  than  traditional 
researcher-driven  evaluations.  At  a  time  when 
decreased  funding  for  prevention  has  spurred 
calls  to  document  intervention  effectiveness,  the 
ability  to  produce  useful,  high-quality  evalua- 
tions has  never  been  more  important. 

Critical  to  designing  useful  evaluations  is 
beginning  the  process  with  a  recognition  of  the 
type  of  intervention  approach.  This  information 
will  influence  not  only  how  the  evaluation 
should  be  designed  but  also  what  types  of  data 
should  be  collected.  Another  important  element 
in  any  evaluation  is  early  attention  to  an  inter- 
vention's stage  of  development.  Attempting  to 
assess intervention outcomes during the early
stages of implementation is not only inappropriate,
it is also wasteful of precious dollars.
Process evaluations that document how an
intervention functions before it achieves outcomes
in later stages are much more appropriate
and provide invaluable information.

Any  formal  evaluation  plan  begins  with  the 
study  design.  Given  that  many  prevention 
efforts  are  multifaceted  in  nature,  the  use  of 
evaluation  plans  that  assess  intervention 
processes  as  well  as  outcomes  through  a  variety 
of  approaches  is  advised.  This  involves  incorpo- 
rating either  experimental  or  quasi-experimen- 
tal research  designs  in  order  to  be  able  to  make 
assertions  about  intervention  outcomes. 

Before  data  collection  begins,  several  practi- 
cal issues  should  be  addressed.  Among  the 
most  common  are  securing  the  support  and 
assistance  of  staff,  access  to  data,  and  the  con- 
sent of  participants,  as  well  as  ensuring  confi- 
dentiality to  participants  and  guaranteeing  that 
collected data, often quite sensitive, are maintained
securely. Another practical issue concerns
providing regular feedback on the
evaluation's  progress  and  preliminary  findings 
to  staff  and  other  constituencies.  Many  evalua- 
tors  fail  to  give  such  feedback,  and  this  is  the 
principal  reason  why  many  practitioners 
remain  skeptical,  or  even  hostile,  toward  evalu- 
ation. The  evaluation  process  is  often  an  intru- 
sion into  the  work  of  the  staff,  particularly 
when  they  have  been  asked  to  assist  with  data 
collection.  Yet  they  also  believe  that  the  evalua- 
tion provides  them  with  nothing  to  enable  them 
to  perform  their  duties  more  effectively.  Fre- 
quent feedback  can  overcome  this  understand- 
able resistance. 


Multifaceted  evaluation  designs  also 
require  the  use  of  a  variety  of  quantitative  and 
qualitative  data  collection  methods.  Evaluations 
might  include  surveys  and  interviews,  focus 
groups,  archival  research,  and  observations. 
Evaluator-practitioner  teams  should  consider 
all  these  techniques  in  light  of  their  appropriate- 
ness to  the  particular  intervention  and  their 
strengths  and  weaknesses. 

The  final  topic  in  this  guide  is  data  analysis. 
The  differences  among  several  types  of  descrip- 
tive and  relational  analyses  on  quantitative  data 
were  discussed.  Without  delving  into  the 
specifics  of  how  to  perform  complex  statistical 
analysis,  the  conceptual  underpinnings  of  these 
types  of  analysis  were  presented  in  the  belief 
that  prevention  administrators  should  be  able 
not  only  to  understand  the  results  of  data  analy- 
sis, but  also  to  ask  better  questions  about  the 
plans  for  data  collection  and  analysis  before  the 
evaluation  is  actually  implemented.  The  steps  in 
qualitative  data  analysis  were  also  discussed  to 
guide  the  transformation  of  volumes  of  qualita- 
tive data  into  accessible  and  useful  information. 

The  potential  benefits  of  employing  the 
recommendations  discussed  in  this  guide  are 
many.  Evaluation  will  become  an  ongoing, 
dynamic,  collaborative  process.  Evaluation 
expectations  will  be  clear  and  appropriate. 
Information  will  steer  program  development. 
All  prevention  activities  will  be  monitored  and 
evaluated  at  some  level,  but  the  most  costly  out- 
come evaluation  activities  will  be  reserved  for 
those  activities  that  are  truly  ready  for  outcome 
evaluation.  Using  this  structure  for  collabora- 
tive evaluation,  prevention  programs  can  expect 
to  strengthen  their  interventions  and  to  amass 
solid evidence of their effectiveness.

Archival research: Research method that
involves  the  use  of  data  extracted  from  exist- 
ing written  or  computer  records. 

Attribution  of  effect:  Concept  that  refers  to 
the  ability  to  conclude  that  an  intervention 
caused  an  outcome. 

Comparison  group:    In  quasi-experimental 
evaluation  design,  a  group  of  evaluation  par- 
ticipants that  is  not  exposed  to  the  interven- 
tion. This  term  usually  implies  that 
participants  were  not  randomly  assigned,  but 
were  similar  to  the  intervention  group  mem- 
bers in  many  respects. 

Confidence  interval:     An  estimated  range  of 
values  derived  from  sample  statistics  with  a 
given  high  probability  of  covering  the  true 
population  value. 

Contingency  table:     A  form  of  relational  analy- 
sis that  classifies  observations  by  their  values 
on  two  or  more  variables.  Each  cell  in  the 
table  represents  a  unique  combination  of  val- 
ues across  the  variables,  and  each  observa- 
tion qualifies  for  one  and  only  one  cell  in  the 
table. 

Control  group:    In  experimental  evaluation 
design,  a  group  of  participants  that  is  essen- 
tially similar  to  the  intervention  group  but  is 
not  exposed  to  the  intervention.  Participants 
are  designated  to  be  part  of  either  a  control 
or  intervention  group  through  random 
assignment. 

Convenience  sample:    A  nonrandom  study 
sample  selected  on  the  basis  of  convenience 
and  accessibility. 


Correlational  analysis:     A  form  of  relational 
analysis that assesses the strength and direction
of association between variables.

Data:    Information  collected  according  to  a 
methodology  and  through  specific  research 
methods  and  instruments. 

Data  analysis:    The  process  of  examining  sys- 
tematically collected  information. 

Descriptive  analysis:     Data  analysis  that 
results  in  information  that  characterizes  the 
sample, such as measures of central tendency
(e.g.,  mean,  median,  mode)  and  measures  of 
variability  (e.g.,  range,  standard  deviation, 
variance).  In  addition  to  describing  the  sam- 
ple, such  data  may  be  used  as  input  for  rela- 
tional analysis. 

Experimental  design:    A  methodology  for 
examining  intervention  outcomes  that 
involves  the  random  assignment  of  subjects 
to intervention and control conditions with a
controlled manipulation delivered to subjects
in  the  intervention  group.  The  design  enables 
the  evaluator  to  conclude  that  the  outcomes 
were  caused  by  the  intervention.  See  also 
quasi-experimental  design. 

Face-to-face  questionnaire:    See  structured 
interview. 

Focus  group:    Qualitative  research  method 
that  involves  structured  discussion  among 
individuals  with  shared  characteristics. 

Individual-oriented  intervention:     An  inter- 
vention that  attempts  to  change  the  behavior 
of  individuals  by  enhancing  the  knowledge, 
attitudes, skills, and beliefs of individuals.

Informed consent: The written permission
obtained  from  research  participants  (or  their 
parents  if  participants  are  minors)  giving 
their  consent  to  participate  in  an  evaluation 
after  having  been  informed  of  the  nature  of 
the  research. 

Institutional review board: A group of
researchers and others appointed by an institution
tution to  assess  proposed  data  collection 
regarding  potential  harm  that  might  be 
caused  to  study  participants. 

Instrument:     Device  that  assists  evaluators  in 
collecting  data  in  an  organized  fashion,  such 
as  a  standardized  survey  or  interview  proto- 
col. 

Intermediate  outcome:    Intervention  outcome, 
such  as  changes  in  knowledge,  attitudes,  or 
beliefs,  that  occurs  before,  and  is  necessary 
for  changes  in,  substance  use  and  substance- 
related  problems.  See  also  long-term  out- 
come. 

Internal  validity:     Concept  that  refers  to  the 
ability  to  make  inferences  about  whether  the 
relationship  between  variables  is  causal  in 
nature  and,  if  it  is,  the  direction  of  causality. 

Intervention:     A  manipulation  applied  to  a 
population  in  order  to  change  behavior.  In 
substance  abuse  prevention,  intervention  at 
the  individual  or  policy  level  may  be  used  to 
prevent  or  lower  the  rate  of  substance  use  or 
related  problems.  See  also  policy  interven- 
tion; individual-oriented  intervention. 

Intervention  group:     In  experimental  and 
quasi-experimental  evaluation  designs,  the 
group  of  participants  that  is  exposed  to  the 
intervention.  See  also  control  group  and 
comparison  group. 

Long-term  outcome:     Intervention  outcome 
corresponding  to  the  prevalence  of  substance 
use  and  substance-related  problems. 

Maturation  effects:    Changes  in  outcomes  that 
are  attributable  to  participants'  growing 
older,  wiser,  stronger,  more  experienced,  and 
the  like,  solely  through  the  passage  of  time. 


Mean:    A  measure  of  central  tendency  comput- 
ed by  summing  over  all  the  values  of  a  vari- 
able and  dividing  by  the  number  of  cases 
(on  average). 

Means  comparison:    A  form  of  relational 
analysis  that  involves  comparing  the  average 
values  of  two  or  more  groups  to  see  if  they 
differ  more  than  would  be  expected  by 
chance. 

Measures  of  central  tendency:    Indices  that 
describe  the  "typical"  or  "average"  value — 
for  example,  the  arithmetic  average  and  the 
median. 

Median:     A  measure  of  central  tendency  refer- 
ring to  the  point  exactly  midway  between  the 
top  and  bottom  halves  of  a  distribution  of 
values. 

Methodology:     Procedure  for  collecting  data. 

Mode:    A  measure  of  central  tendency  refer- 
ring to  the  value  most  often  given  by  respon- 
dents. 

Nonrepresentative  sample:    A  segment  of  a 
larger  body  or  population  that  does  not  mir- 
ror in  composition  characteristics  of  the  larg- 
er body  or  population.  See  also 
representative  sample. 

Observation:     Data  collection  method  involv- 
ing unobtrusive  examination  of  behavior 
and/or occurrences, often in a natural setting,
and characterized by no interaction
between  participants  and  observers. 

Outcome  evaluation:    Evaluation  that  focuses 
research  questions  on  assessing  intervention 
effects  on  intended  outcomes.  See  also 
process  evaluation. 

Participant-observation:     Qualitative  research 
method  that  requires  simultaneous  participa- 
tion in  and  examination  of  activity  in  a  natur- 
al setting.  The  identity  of  the  evaluator  as  an 
evaluator  is  usually  made  known  to  others  in 
the  setting. 

Policy  intervention:     An  intervention  that 
attempts  to  change  the  behavior  of  individu- 
als by  changing  the  economic  or  regulatory 
environment around substance use.

Process evaluation: Evaluation that focuses
research  questions  on  the  nature  of  interven- 
tion implementation  or  its  structure  and 
operations.  See  also  outcome  evaluation. 

Qualitative  data:     Generally,  contextual  infor- 
mation in  evaluation  studies,  usually  describ- 
ing participants  and  interventions.  The 
strength  of  qualitative  data,  which  is  often 
presented  as  text,  is  its  ability  to  illuminate 
evaluation  findings  derived  from  quantita- 
tive methods.  See  also  quantitative  data. 

Quantitative data: Measures that in evaluation
studies capture changes in targeted outcomes
(e.g., substance use) and intervening variables
(e.g., attitudes toward use). The strength of
quantitative data is its use in testing
hypotheses and determining the strength and
direction of effects.

Quasi-experimental design: A plan for examining
intervention outcomes that involves an
intervention group and may involve a comparison
group and/or preintervention and/or
postintervention tests. This design does not
involve random assignment of participants to
conditions.

Questionnaire: Research instrument that consists
of written questions, each with a limited set of
possible responses. See also self-administered
questionnaire; structured interview; telephone
survey.

Random assignment: The process through which
members of a pool of eligible evaluation
participants are assigned to either the
intervention group or a control group on a random
basis, such as through the use of a table of
random numbers.
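
A minimal sketch of random assignment follows. It substitutes a
seeded pseudorandom shuffle for the table of random numbers
mentioned above; the function name and participant IDs are
hypothetical.

```python
import random

def randomly_assign(participants, seed=None):
    """Split eligible participants into intervention and control groups."""
    rng = random.Random(seed)  # a fixed seed lets the assignment be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)      # every ordering of the pool is equally likely
    half = len(shuffled) // 2
    # First half goes to the intervention, the remainder to the control group.
    return shuffled[:half], shuffled[half:]

eligible = [f"P{i:02d}" for i in range(1, 21)]  # hypothetical participant IDs
intervention, control = randomly_assign(eligible, seed=42)
print(len(intervention), len(control))  # 10 10
```

Because the shuffle, not the evaluator or the participant, decides
who lands in each group, the two groups should differ only by
chance before the intervention begins.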

Relational analysis: Data analysis that reveals
the relationship between variables considered
important for the evaluation (for example,
correlational analysis and regression analysis).
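
The two examples named in this entry can be sketched in a few
lines. The code below computes a Pearson correlation coefficient
and a least-squares regression line by hand; the data (sessions
attended and a knowledge score) and all function names are
hypothetical.

```python
from math import sqrt

def _sums(x, y):
    """Return the means and the sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return mean_x, mean_y, sxy, sxx, syy

def pearson_r(x, y):
    """Correlational analysis: strength and direction of a linear relationship."""
    _, _, sxy, sxx, syy = _sums(x, y)
    return sxy / sqrt(sxx * syy)

def least_squares_line(x, y):
    """Regression analysis: slope and intercept of the best-fitting line."""
    mean_x, mean_y, sxy, sxx, _ = _sums(x, y)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

sessions = [1, 2, 3, 4, 5]  # hypothetical sessions attended
scores = [2, 4, 5, 4, 5]    # hypothetical knowledge scores

r = pearson_r(sessions, scores)
slope, intercept = least_squares_line(sessions, scores)
print(round(r, 3), round(slope, 2), round(intercept, 2))  # 0.775 0.6 2.2
```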

Representative sample: A segment of a larger body
or population that mirrors in composition the
characteristics of the larger body or population.
See also nonrepresentative sample.


Sample: A segment of a larger body or population.

Sample attrition: Unplanned reduction in the size
of the study sample because of participants'
dropping out of the evaluation (for example,
because of relocation).

Self-administered questionnaire: A questionnaire
that is completed by the respondent without any
assistance or clarification from the evaluator.

Self-selection: An occurrence in which
individuals themselves choose to participate in a
program or become a member of a sample without
the control of the evaluator.

Semistructured interview: Qualitative data
collection method that involves an interviewer
and specific questions with unlimited response
options. See also structured interview;
unstructured interview.

Simple random sample: In experimental research
designs, a sample derived from indiscriminate
selection from a pool of eligible participants,
such that each member of the population has an
equal chance of being selected for the sample.
See also stratified random sample.
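
A simple random sample can be drawn as sketched below, assuming
the pool of eligible participants is available as a list. The IDs,
pool size, and sample size are hypothetical.

```python
import random

rng = random.Random(7)  # seeded only so the draw can be reproduced

pool = [f"P{i:03d}" for i in range(1, 101)]  # hypothetical pool of 100 people

# random.sample draws without replacement, giving every member of the
# pool the same chance of selection.
sample = rng.sample(pool, k=10)
print(len(sample))  # 10
```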

Snowball sample: A nonrandom sample that is
composed according to the referrals of initial
sample members such that sample members not only
share certain common characteristics, but are
likely to be familiar with one another. Also
referred to as chain sample.

Standard deviation: A unit of measure of
variability or dispersion characterizing the
tendency for observations to depart from central
tendency. Standard deviation and its square,
variance, reflect how accurately the central
tendency measures (such as the mean) would
describe a randomly selected observation.
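
The relationship between standard deviation and variance can be
checked on a small set of hypothetical scores. The sketch uses the
population formulas (dividing by n); `statistics.stdev` and
`statistics.variance` give the sample versions, which divide by
n - 1 instead.

```python
from statistics import mean, pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

center = mean(scores)         # central tendency: 5
variance = pvariance(scores)  # average squared distance from the mean: 4
std_dev = pstdev(scores)      # square root of the variance: 2.0

print(center, variance, std_dev)
```

A small standard deviation relative to the mean indicates that the
mean describes a typical observation well; a large one indicates
that individual observations often fall far from it.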

Statistical significance: A term referring to how
unlikely it is that an observed relationship
between variables arose by chance. A relationship
is said to be statistically significant when it
appears so consistently in the data that its
existence is probably not attributable to chance.
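
One way to make "probably not attributable to chance" concrete is
a permutation test, sketched below. This technique is offered only
as an illustration and is not prescribed by this glossary; the
function name and the data are hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate how often chance alone yields a mean difference at least
    as large as the one observed between the two groups."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_permutations

# Hypothetical days of substance use in the past month.
control = [5, 6, 7, 6, 5, 7, 6, 8]
intervention = [2, 3, 2, 4, 3, 2, 3, 1]

p = permutation_p_value(intervention, control)
print(p < 0.05)  # True: a difference this large rarely arises by chance
```

A small p-value says only that chance is an unlikely explanation.
It does not measure how large or how important the difference is.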

Stratified random sample: In experimental
research designs, a sample group derived from
indiscriminate selection from different
subsegments of a pool of eligible participants
(e.g., men, women). See also simple random
sample.
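
A stratified draw can be sketched by grouping the pool into
subsegments and sampling at random within each. The function name,
the record layout, and the choice of an equal number per stratum
are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(pool, stratum_of, per_stratum, seed=None):
    """Draw a simple random sample of fixed size from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in pool:
        strata[stratum_of(person)].append(person)
    # rng.sample raises ValueError if a stratum has fewer members than requested.
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical pool: 20 women (odd IDs) and 20 men (even IDs).
pool = [{"id": i, "sex": "woman" if i % 2 else "man"} for i in range(1, 41)]

sample = stratified_sample(pool, lambda p: p["sex"], per_stratum=5, seed=1)
print({name: len(members) for name, members in sample.items()})
```

Sampling within each subsegment guarantees that every subsegment
is present in the sample, which a simple random sample of the whole
pool cannot promise.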

Structured interview: Quantitative data
collection method that involves an interviewer,
specific questions, and limited sets of possible
responses to each question. Sometimes referred to
as face-to-face questionnaire. See also
semistructured interview; unstructured interview.

Telephone survey: A structured interview
conducted over the telephone.


Time-series analysis: A form of data analysis
that involves examination of data derived from
repeated assessments across time.

Unstructured interview: Qualitative data
collection method that involves an interviewer
and given questions. Not all given questions may
be asked, however, and additional substantive
questions (not necessarily questions for
clarification purposes) may be posed by the
interviewer.

Variable: Factor or characteristic of the
intervention, participant, and/or context that
may influence or be related to the possibility of
achieving intermediate and long-term outcomes.